Preface


The  1994  International  Conference  on  Optical  Computing  took  place  at  the  Edinburgh 
Conference  Centre,  Heriot-Watt  University,  UK  on  22-25  August. 

The  conference  series  is  organized  under  the  auspices  of  a  steering  committee 
established  by  the  International  Commission  for  Optics.  Previous  meetings  include 
those  at  Minsk,  Belarus  (1992),  Kobe,  Japan  (1990)  and  Toulon,  France  (1988).  In 
alternate  years  the  conference  is  held  in  the  USA,  as  a  Topical  Meeting  of  the  Optical 
Society of America: Palm Springs (1993), Salt Lake City (1991, 1989). The subject
itself,  however,  can  be  traced  back  to  the  early  1960s,  and  many  meetings,  some  with 
published  proceedings,  were  held  prior  to  the  ICO  and  OSA  series. 

The  particular  topics  covered  in  Optical  Computing  conferences  have  varied  over 
the  years,  as  new  optical  techniques  have  become  recognized  and  more  especially  as 
new  devices  have  become  available.  For  example  one  of  the  earliest  meetings,  the  1964 
Symposium  on  Optical  and  Electro-Optical  Information  Processing,  was  held  in  Boston, 
Massachusetts,  just  four  years  after  the  demonstration  of  the  laser.  The  meeting  included 
a  wide  range  of  topics  from  coherent  optical  processing  and  synthetic  aperture  radar 
processing  through  to  optical  storage  and  ideas  for  optically  bistable  devices.  Fourier 
and  acousto-optic-based  analogue  processing  were  to  dominate  the  subject  through  the 
1970s;  optical  correlators  continue  to  evolve  and  new  developments  may  be  found  in 
the  papers  in  Part  III  of  the  present  Proceedings.  Acousto-optic-  (Bragg-cell-)  based 
matrix-vector  processors  also  began  to  appear  in  the  mid-1970s.  These  topics  still 
formed  the  mainstream  for  conferences  through  to  1983.  Topics  such  as  tomography  and 
radioastronomy  also  appeared  briefly  during  the  1970s  meetings,  but  have  since  been 
treated  in  conferences  of  their  own,  as  has  much  of  integrated  optics. 

By  1984  the  advantages  of  optical  interconnections  within  computing,  using  fibres, 
waveguides  or  free  space,  came  under  the  umbrella  of  optical  computing.  The  guided- 
wave  work  has  tended  recently  to  be  covered  in  Photonic  Switching  meetings,  but  free- 
space  interconnection  and  associated  switching  networks  formed  a  major  part  of  the  1994 
meeting — Part  II  herein. 

1985  saw  the  introduction,  at  the  Lake  Tahoe  conference,  of  sessions  on  optically 
bistable  devices  and  systems.  Low-power  optical  bistability  had  been  researched  for 
five  years,  and  optical  logic  and  volatile  memory  devices  were  being  studied.  The 
possibilities  for  all-optical  computing  stemmed  from  this  work  and  formed  a  significant 
part  of  meetings  through  to  the  1990s.  Work  continues  in  this  area;  new  devices  are 
covered  in  Part  VI.  The  trend,  however,  has  been  toward  ‘smart-pixel’  technologies, 
with  optically  interfaced  arrays  of  electronic  cells.  These  were  first  reported  at  the  1991 
Topical  Conference  at  Salt  Lake  City;  they  are  treated  here  in  Part  V. 

With the injection of further enthusiasm, and new architectural opportunities,
provided  by  smart  pixels,  digital  optical  computing  remains  one  of  the  major  themes 
of  the  subject.  Part  I  of  this  Proceedings  contains  papers  on  algorithms,  architectures 
and  a  range  of  implementations  of  digital  optical  schemes  for  parallel  processors. 

In  addition  to  bistability,  1985  saw  the  development  of  spatial  light  modulators  (Part 
V)  and  their  use  initially  as  inputs  to  Fourier  processors.  Optical  neural  networks  (Part  IV) 
were  also  first  introduced  at  the  ’85  meeting;  from  the  first  digital  Hopfield  demonstrators 
advantage  has  been  taken  of  Fourier  (holographic)  techniques  and  the  reconfigurability 
of  interconnections  offered  by  advances  in  photorefractive  materials.  The  significance  of 
both  device  developments  and  of  materials  advances  has  been  long  recognized  in  optical 
computing.  The  1990  Kobe  conference  was  especially  strongly  influenced  by  these  areas, 
and  the  recent  smart-pixel  possibilities  owe  much  to  developments  in  semiconductor 
growth  and  fabrication  techniques. 

The  purpose  of  the  Optical  Computing  conference  series  is  both  to  enable  the 
communication  of  state-of-the-art  information  among  delegates  from  around  the  world, 
and  to  foster  interactions  and  friendships  among  members  of  the  scientific  community. 
The  Heriot-Watt  conference  was  attended  by  over  250  delegates,  originating  from  24 
countries.  There  were  9  extended  invited  talks,  52  contributed  talks  and  135  poster 
presentations  during  the  4-day  meeting,  including  17  post-deadline  presentations.  The 
potential  applications  for  optics  in  computing  were  discussed  throughout,  and  as  a  foil 
an  invited  talk  on  ‘Architecture  design  and  implementation  issues  for  massively  parallel 
processors’  was  presented  by  Steve  Nelson  of  Cray  Research  Inc.  There  was  a  healthy 
mix  of  attendees  from  both  the  academic  and  industrial  research  communities,  with 
papers  covering  topics  from  fundamental  information  theory  in  electronics  and  photonics, 
through  optical  phenomena,  devices,  architectures  and  processor  implementations,  to 
integration  and  packaging  issues. 

It  is  my  belief  that  both  the  technical  and  social  aspects  of  the  meeting  were 
appreciated  by  all  delegates.  I  must  thank  the  personnel  of  the  Edinburgh  Conference 
Centre  at  Heriot-Watt  for  their  professionalism,  which  established  the  framework  for  the 
smooth  organization  of  the  meeting.  The  conference  could  not  have  taken  place  without 
the  backing  of  the  Department  of  Physics  at  Heriot-Watt,  and  the  willing  assistance  of 
many  members  of  the  academic,  technical  and  secretarial  staff,  and  graduates  of  the 
department.  It  goes  without  saying  that  the  efforts  over  a  long  period  of  the  Local 
Organizing  Committee  (Andy  Walker,  Mo  Taghizadeh,  Frank  Tooley,  John  Snowdon, 
Jimmy  Smith  and  Janice  McClelland)  are  thoroughly  appreciated.  My  thanks  to  all. 

For  assistance  with  the  production  of  this  Proceedings  I  should  particularly  like  to 
thank  Janice  McClelland  and  Florence  Jensen. 


Brian  Wherrett 
Abstract. The behaviour of noisy electronic and photonic channels is significantly
different. This is because bosons and fermions obey different
statistics. For one-dimensional signals, quantum effects restrict the information
capacity  of  channels  at  input  data  rate  >  10^'^nit/s.  Below  this  rate  both  types 
of channels behave classically. At higher input data rates the energy required for
transmission of information is two to four orders of magnitude lower in a one-dimensional
boson channel than in the corresponding one-dimensional fermion channel.
Three-dimensional optical information channels are free of quantum limitations up to
information  rates  ~  10^^ nit /cm^s. 


1.  Introduction 

The  basic  results  in  information  theory  concerning  the  channel  capacities,  transmitting 
and  processing  rates,  etc.  have  been  obtained  by  use  of  classical  physics  methods.  The 
investigation  of  the  information  features  of  the  electromagnetic  (photon)  channels  in  the 
presence  of  noise,  treated  on  the  basis  of  Bose-Einstein  statistics  [1],  shows  quantum 
effects to be significant under certain conditions. However, at low signal-to-noise ratios
photonic channels behave classically and are described well by the Shannon information
theory. The latter does not differentiate between photon and electron channels from the
point of view of the quantum nature of the information-carrying agent.

Since photons and electrons obey different statistics, the relationships between
signal and noise are different for these two cases. For photons the power of the total
radiation (signal and noise) can be considered as an independent sum of the corresponding
values of the signal and noise powers, because the occupation of any microstate
either by signal photons or by thermal (noise) photons is statistically independent.
This is not so for electrons: owing to the Pauli principle, a definite microstate of the
noise restricts the variety of possible microstates of the signal, and vice versa. In the
quantum limit this difference leads to different possibilities for transmitting information
through electronic and photonic channels in the presence of noise. This was underlined in
[1]. However, to date we know of no corresponding consideration for the case of fermion
channels. Meanwhile, such a consideration becomes more necessary as the speed of
information processing systems increases and the physical extent of electronic and
opto-electronic devices decreases (micro- and nano-structures), which leads to high energy
densities of the data flows. In the present short paper we summarize our results
on the general information features of fermion (electron) channels with equilibrium
noise, studied on the basis of Fermi-Dirac statistics. The corresponding properties
of the fermion and boson (all-optical) channels are also compared. The results are of
interest for electronics in its own right and provide a fundamental basis for comparison
with other technologies.

2.  Formulation  of  the  problem 

Let us consider an information channel consisting of a transmitter, an information
transmission agent or physical carrier with a certain space-time extent, i.e. the channel
itself, and a receiver. It is assumed that the role of the physical carrier is played by
a quantum ensemble governed by either Fermi-Dirac or Bose-Einstein statistics. The
transmission of data in such a channel means the formation of a certain microstate of the
ensemble, say, in terms of occupation numbers assigned to energy levels of the channel
degrees of freedom. We also assume that the physical carrier of information is affected
by some noise sources. Owing to this noise, the microstate of the physical carrier
determined by the receiver at the channel output will be different from that introduced by
the transmitter at the input.

Setting aside possible physical limitations of both transmitter and
receiver, we aim to find the limitations related only to the information channel
itself, with respect to its quantum properties and the existence of noise. As an example
of a fermion channel we consider here an electron channel, which is one-dimensional by
its nature.

Given the specifics of the problem, the energy distribution over the signal
degrees of freedom is considered in the "time-frequency" phase space. For the case of a
one-dimensional (1D) stationary temporal signal, its energy is represented by its Fourier
harmonics. We introduce the quantum properties into the information channel by allowing
the energies of the Fourier harmonics (the channel's degrees of freedom) to be quantized.

Since we deal with an ensemble (of particles, signal microstates, degrees of freedom,
etc.) we must speak of the mean occupation numbers of the ensemble elements. Thus,
from a more general point of view, the choice of the "time-frequency" phase space of the
physical carrier of information instead of the "coordinate-momentum" phase space means
no more than a replacement of the mean over the ensemble by the mean over time
(ergodic theorem).

In accordance with its definition, information is the average measure of
uncertainty in predicting the result of a completely random process (see, for example,
the original work of Shannon [2] and the interpretation on the basis of the Brillouin
negentropy principle [3]). The mathematical expression for the value of information has the
form of an entropy equation in thermodynamics. The only difference is that in statistical
thermodynamics the entropy is understood as a measure of disorder.

The information channel capacity is the maximum amount of information transmitted
by a channel per unit time and is given as:

$$C = \frac{H_{\max} - H_0}{\tau} \qquad (1)$$

where $H_{\max} - H_0$ is the so-called entropy defect of the physical system considered and
$\tau$ is the signal duration. The entropy value $H_0$ corresponds to the case when the channel
input is driven by a determinate signal. In contrast, $H_{\max}$ is the maximum possible
value of entropy of the system, which corresponds to its thermodynamic equilibrium, i.e.
when the probabilities of the system microstates are given by the Gibbs distribution. Such
an equilibrium channel state arises when the channel is excited completely at
random. In other words, the entropy defect shows how far the determinate choice of the
signal moves an information channel (as a physical system) away from its thermodynamic
equilibrium. Thus, the evaluation of the information capacity of a particular channel is
related to the problem of computing the above $H_{\max}$ and $H_0$ values of the entropy.

For the case when information is communicated through a channel with noise,
the total energy $E_J$ received by the receiver from the channel output during the time
interval $\tau$ consists of the energy of the signal $E_S$ introduced at the channel input and an
accumulated noise energy $E_N$:

$$E_J = E_S + E_N. \qquad (2)$$

This gives a formal basis for evaluating the above maximum entropy values $H_{\max}$ for
both quantum types of channels, boson and fermion, by using the corresponding distribution
functions. This leads to the unique establishment of the corresponding effective
temperatures $T_J$ of the channels and allows us to express the values of the joint energies
$E_J$ and, consequently, the maximum entropies $H_{\max}$ as functions of these effective
temperatures: $E_J = E(T_J)$; $H_{\max} = H(T_J)$, at least implicitly.

For the entropy value $H_0$, however, the problem is not so trivial. For a boson
channel, the signal and noise joint distribution function can be considered without
restriction as the sum of the signal and noise distribution functions. Hence, in the presence
of a determinate signal (this means that the signal probability equals 1) the total
energy transmitted through the channel during time $\tau$ is the sum of the signal and noise
energies, but the entropy flux remains unchanged, because the microstate of the signal is
completely determined and adds nothing to the entropy. The latter is then
the entropy of the noise due to the channel interaction with an environment at temperature
$T_0$. Hence we can take the value $H_0$ required in Eq. (1) as the boson
noise entropy: $H_0 = H^{B}(T_0)$.
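
As a hedged numerical illustration of Eqs. (1) and (2) for the boson case, the sketch below assumes a broadband, single-polarization 1D channel whose noise has a Planck spectrum at temperature $T_0$. Under that assumption the equilibrium power is $(\pi k T)^2/6h$ and the entropy flux is $\pi^2 k T/3h$ nit/s; these closed forms are the standard textbook results for such a channel and are not quoted from the text above. The function names are illustrative only.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def thermal_power(temperature):
    """Equilibrium power (W) in a broadband 1D boson channel (assumed model)."""
    return (math.pi * K_B * temperature) ** 2 / (6.0 * H)


def entropy_rate(temperature):
    """Entropy flux (nit/s) of the same equilibrium radiation."""
    return math.pi ** 2 * K_B * temperature / (3.0 * H)


def effective_temperature(signal_power, noise_temperature):
    """Solve E(T_J) = E_S + E_N for the effective temperature T_J, cf. Eq. (2)."""
    total_power = signal_power + thermal_power(noise_temperature)
    return math.sqrt(6.0 * H * total_power) / (math.pi * K_B)


def boson_capacity(signal_power, noise_temperature=300.0):
    """Entropy defect per unit time, cf. Eq. (1), with H_0 = noise entropy."""
    t_j = effective_temperature(signal_power, noise_temperature)
    return entropy_rate(t_j) - entropy_rate(noise_temperature)


def classical_capacity(signal_power, noise_temperature=300.0):
    """Broadband Shannon limit P_S / (k T_0), in nit/s, for comparison."""
    return signal_power / (K_B * noise_temperature)
```

At low signal power `boson_capacity` approaches `classical_capacity`, and at high signal power it falls below it, which is the classical-to-quantum crossover discussed in this paper for this simplified model.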

For the fermion case, however, introducing a "determinate" signal to the channel
input is questionable. Noise occurring in the channel restricts the occupation of certain
channel degrees of freedom. This is because the same energy state cannot
be occupied by more than one element in the case of a fermion ensemble: fermions obey
the Pauli principle.

As a result of this statistical interdependence of signal and noise, we can introduce
a particular signal to a fermion channel with, at best, some maximum probability that is
less than 1. Hence it is not possible to reduce the entropy value $H_0$ to the fermion
noise entropy $H^{F}(T_0)$. However, as we are not able to compute the value $H_0$ explicitly
at this stage, we may replace it by $H^{F}(T_0) < H_0$ in Eq. (1). This allows us to evaluate
the upper limit of a fermion information channel capacity for an arbitrary signal energy
$E_S$, in the same way as for a boson channel.
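
The difference between the two statistics can be illustrated for a single degree of freedom. The sketch below is a simplification and not the broadband calculation of this paper: it assumes one mode with mean noise occupation `n_noise` and mean signal occupation `n_signal`, adds them as in Eq. (2), and uses the standard equilibrium entropies of a Bose-Einstein and a Fermi-Dirac mode. All names are illustrative.

```python
import math


def boson_entropy(n):
    """Entropy (nit) of one Bose-Einstein mode with mean occupation n >= 0."""
    if n == 0:
        return 0.0
    return (1.0 + n) * math.log(1.0 + n) - n * math.log(n)


def fermion_entropy(n):
    """Entropy (nit) of one Fermi-Dirac mode with mean occupation 0 <= n <= 1."""
    if n in (0.0, 1.0):
        return 0.0
    return -n * math.log(n) - (1.0 - n) * math.log(1.0 - n)


def boson_defect(n_signal, n_noise):
    """Entropy defect H_max - H_0 of one boson mode, with H_0 = noise entropy."""
    return boson_entropy(n_signal + n_noise) - boson_entropy(n_noise)


def fermion_defect_bound(n_signal, n_noise):
    """Upper bound on the fermion entropy defect: H_0 replaced by noise entropy."""
    n_total = n_signal + n_noise
    if n_total > 1.0:
        # Pauli principle: a single fermion mode cannot hold more than one particle.
        raise ValueError("total occupation of a fermion mode cannot exceed 1")
    return fermion_entropy(n_total) - fermion_entropy(n_noise)
```

In this toy model the fermion bound cannot exceed ln 2 nit per mode, whereas the boson defect keeps growing with the signal occupation.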

3.  Information features of the electron and photon channels and their comparison

Based on the above approach we have simulated some information features of the fermion
and boson channels and compared them on the assumption that they interact with their
environment under the same conditions. Fig. 1 shows some examples of the
behaviour of the information capacities for both types of channels. More complete results
of the study undertaken are to be published [4].


Fig. 1. Information capacities for one-dimensional fermion and boson channels at
$T_0 = 300$ K as functions of the signal power. $C_{class}$ are classical (Shannon) capacities;
$C_F$ and $C_B$ are the capacities of the fermion and boson channels, respectively. The
superscripts "∞" and "lim" indicate the broad-band and finite-band cases, respectively.
The finite-band fermion channels (the case of a bandwidth of 10^^ Hz is shown by the
broken-off sequence of circles o) have maximum information capacities, which correspond
to the maximum input signal power allowed. (1 nit = 1/ln 2 bit ≈ 1.44 bit)


The results obtained demonstrate that quantum-statistical effects become significant
for both types of channels when the signal power is essentially represented by its
harmonics with frequencies which exceed the effective boundary frequency of the noise.
For 1D channels the onset of the influence of quantum effects corresponds to an input
data  rate  of  about  This  is  the  fundamental  limit  for  electronic  systems. 

The information capacities of finite-band fermion channels are limited in principle,
whereas those of finite-band boson channels are not.

3D  optical  channels  are  free  of  quantum  statistical  restrictions  up  to  input  data 
rates  of  about  /cm^5  [4]. 

In the region where quantum effects are influential, data transmission and processing are
no longer linear processes, in the sense that the information features of the channels
cease to be independent of the signal powers/energies. This implies the onset of
nonlinear dynamics of the information processes.
Abstract

Fault tolerance can be incorporated in digital optical computers using a distributed
redundancy technique called quadding, at the expense of quadrupling the hardware and
increasing the fan-in and fan-out to 4.

A  key  requirement  in  a  practical  digital  optical  computing  scheme  is  the  ability  to  recover 
from  transient  and  permanent  device  faults  and  signal  errors.  This  is  especially  critical  when 
using  emerging  technologies  such  as  optical  switch  arrays  which  are  bound  to  have  large  numbers 
of device errors, and for optically interconnected systems which may suffer transient errors such
as those due to dust particles floating through the optical beams. Redundancy is the most
common  technique  utilized  to  endow  a  system  with  a  limited  degree  of  fault  tolerance  and 
increase  the  system  reliability  beyond  that  given  by  the  product  of  the  probabilities  of  correct 
operation  of  the  components.  However,  in  many  redundant  systems,  a  voter  is  required  to 
resolve  conflicts  between  the  redundant  components,  but  a  fault  in  the  voter  still  produces 
erroneous  outputs.  Although  multiply  redundant  voters  can  be  incorporated,  a  mechanism 
must  be  included  that  eliminates  and  replaces  faulty  elements  from  the  circuits,  or  else  errors 
can  propagate.  These  techniques  can  become  quite  complex  and  may  be  inappropriate  for 
optical  implementation.  Another  approach  is  to  distribute  the  voter  throughout  the  circuit 
using  the  technique  of  quadded  logic. [1-3]  In  quadded  logic,  4  copies  of  the  circuit  are  produced, 
then  interconnected  in  a  permuted  fashion  that  allows  isolated  errors  within  a  quadded  block 
of  elements  to  be  detected  and  corrected  within  the  next  few  layers.  This  is  often  considered 
an  expensive  approach  to  fault  tolerance,  since  it  multiplies  the  hardware  by  a  factor  of  4,  and 
doubles  the  fan-out  and  fan-in  of  the  elements.  However,  the  optical  implementation  of  quadded 
logic  in  a  regularly  interconnected  system  has  some  attractive  features  as  shown  in  Figures  1 
and  2.  The  depth  of  the  circuit  is  not  increased,  just  the  width,  so  no  additional  delay  or 
speed  penalties  are  imposed.  The  interconnections  between  the  quadded  circuits  are  reasonably 
regular  and  appear  to  be  amenable  to  optical  implementations.  These  interconnection  topologies 
are reminiscent of optical crossovers, and might be implemented with a similar technique. One
possibility is to interleave the original circuits on rows separated by 4, and interleave the quadded
duplicates at the intervening positions. The same basic architecture of shuffles or crossovers
within  the  rows  can  be  performed,  and  holographic  interconnections  within  a  quadded  set  of  4 
rows  might  not  overly  increase  the  system  complexity. 

The operation of a quadded system based on a regular logical structure is illustrated in
Figure 1. Device errors are indicated as signals in boldface type. For example, x = 1 should
be represented at the input with four 1's; however, the last input is in error. The quadded
interconnects in the first layer mix the signals from the first two and last two redundant inputs,
represented as (12,34). This corrects the NOR gate subcritical error 1 → 0 in one step. On
the other hand, the input error for x is a critical 0 → 1 NOR error, which is converted by the
first layer quadded interconnect to two subcritical errors, shown in italics. As long as the next
layer of quadded wiring mixes the correct and erroneous signals, these subcritical errors will be
corrected in the next layer, and this is accomplished by mixing the odd redundant signals and
the even redundant signals, represented as (13,24). Similarly, errors introduced throughout the
logical structure can be corrected, unless they appear too close to another error. As a result,
error-free outputs are produced.
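
The correction mechanism described above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses a hypothetical regular circuit in which gate i of each layer is a NOR of logical signals i and i+1 (with wrap-around) from the previous layer, which is not necessarily the circuit of Figure 1, and it alternates the (12,34) and (13,24) patterns between layers. Copies are numbered from 0 in the code.

```python
from itertools import product

# Copy c of a quadded gate reads the two redundant copies listed for c.
PATTERN_12_34 = {0: (0, 1), 1: (0, 1), 2: (2, 3), 3: (2, 3)}
PATTERN_13_24 = {0: (0, 2), 2: (0, 2), 1: (1, 3), 3: (1, 3)}


def nor(bits):
    """Logical NOR of an iterable of 0/1 values."""
    return 0 if any(bits) else 1


def run_plain(inputs, depth):
    """Unprotected circuit: gate i is NOR(signal i, signal i+1) in every layer."""
    signals = list(inputs)
    n = len(signals)
    for _ in range(depth):
        signals = [nor((signals[i], signals[(i + 1) % n])) for i in range(n)]
    return signals


def run_quadded(copies, depth):
    """Quadded circuit: copies[c][i] is redundant copy c of logical signal i."""
    state = [list(c) for c in copies]
    n = len(state[0])
    for layer in range(depth):
        pattern = PATTERN_12_34 if layer % 2 == 0 else PATTERN_13_24
        new_state = []
        for c in range(4):
            a, b = pattern[c]
            # Fan-in doubles from 2 to 4: two copies of each logical input.
            new_state.append([
                nor((state[a][i], state[b][i],
                     state[a][(i + 1) % n], state[b][(i + 1) % n]))
                for i in range(n)
            ])
        state = new_state
    return state


def single_fault_is_masked(width=4, depth=4):
    """True if every single flipped input copy leaves all four outputs correct."""
    for inputs in product((0, 1), repeat=width):
        expected = run_plain(inputs, depth)
        for copy in range(4):
            for position in range(width):
                copies = [list(inputs) for _ in range(4)]
                copies[copy][position] ^= 1  # inject one input error
                if any(out != expected for out in run_quadded(copies, depth)):
                    return False
    return True
```

In this toy circuit a 1 → 0 input error is absorbed by the first layer, while a 0 → 1 input error needs the second, differently permuted layer, so `single_fault_is_masked` holds for a depth of two or more but not for a single layer.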

The regular, layered structure of the conventional digital optical computer interconnects is
eminently suited for quadding without the complexities encountered in random logic. In
addition, the quadded interconnects take advantage of the capabilities of the optical interconnects
to  accommodate  complex  wiring  topologies.  Device  defects  often  occur  in  clusters  and  this  can 
nullify  the  benefits  of  local  quadding  as  in  Figure  1  since  two  critical  errors  within  a  quad  can 
be  uncorrectable.  One  powerful  feature  of  optical  quadding  is  the  ability  to  avoid  cluster  defects 
through  the  use  of  the  long  range  capabilities  of  optical  interconnections.  As  an  example  Figure 
4  shows  a  two  shuffle  interconnection  and  its  long  range  quadded  equivalent.  Clustered  defects 
would  only  affect  one  of  the  four  members  of  each  quad,  allowing  successful  error  correction. 

The  isomorphism  between  the  (12,34)  and  (14,23)  quadding  patterns  with  crossover  inter¬ 
connects  should  allow  efficient  implementation  of  quadding  in  crossover  networks,  and  in  these 
regular  structures  only  two  alternating  quadded  patterns  are  needed.  Alternatively,  by  placing 
the  redundant  quadded  logic  devices  in  a  sparse  topology  then  a  shift-invariant  interconnection 
can  be  utilized.  The  example  shown  in  Figure  3  shows  that  the  (12,34)  and  (13,42)  quadding 
patterns  can  be  implemented  with  a  fanout  of  3,  light  efficiency  of  66%  and  device  packing 
densities  of  66%  and  50%  respectively,  while  the  quadding  pattern  (14,23)  requires  a  fanout  of 
5,  and  achieves  a  light  efficiency  of  only  40%  and  a  packing  density  of  only  44%.  Quadding  as  2 
by  2  blocks  allows  2-D  shift  invariant  interconnections  with  44-50%  packing  density.  For  these 
shift  invariant  topologies  the  (12,34)  and  (13,42)  quadding  patterns  should  be  alternated.  A 
space invariant implementation of a programmable two shuffle with quadding for error correction
implemented with a holographic interconnection is shown schematically in Figure 5. The
two  shuffle  is  programmed  by  splitting  the  two  parts  of  the  shuffle  vertically  where  they  can  be 
blocked  or  passed  under  the  control  of  a  mask  or  SLM.  These  two  halves  are  then  combined, 
for  example  using  walkoff  in  an  off-axis  anisotropic  plate.  A  2:1  anamorphic  magnification  is 
required  in  x  to  replicate  the  input  NOR  gate  topology,  which  can  be  accomplished  with  either 
cylindrical  optics  or  prism  beam  expanders.  The  output  of  the  odd  layers  of  (12,34)  quadded 
two-shuffle  logic  must  be  quadded  with  a  different  pattern  in  the  intervening  even  layers,  and 
these  quadding  patterns  must  alternate.  This  can  be  accomplished  with  a  different  shift  invariant 
hologram  that  implements  the  programmable  two  shuffle  and  the  (13,24)  quadding  interconnect 
in  the  even  layers.  This  output  can  be  fed  back  to  the  original  input  with  a  1  row  vertical  shift 
to implement a many layered fault-tolerant logical system with programmable space-variant
operations,  yet  utilizing  space  invariant  interconnections. 

Figure 2: One possible 3-D topology of quadded regular in-
Figure 3: Sparse device layout
Figure 4: Two-shuffle interconnect and quadded equivalent using long range interconnects for
quadding to avoid clustered defects.

The error performance of a simple multilayered regularly-interconnected NOR gate based
logical system is compared with its quadded equivalent in Figure 6, as a function of input
probability of error and additive detector noise at the threshold logic NOR gate inputs. The
original circuit has an output error probability that increases linearly with input error probability,
and detector noise worsens the situation. This indicates the lack of error correction in the
original circuit. The quadded version of this circuit demonstrates an output error probability
that increases more slowly, quadratically with input error probability, and appears to have a
large region of completely error-free performance, as desired. These data are for the 4 level circuit
shown in Figures 1 and 4, and deeper circuits should have even more tolerance to errors. For
low probability of device errors, the general tendency is for the system to achieve an error
performance  approaching  the  square  of  the  device  error  performance,  so  a  system  of  10^  devices 
with  individual  probability  of  error  of  10“^°  should  lead  to  a  system  probability  of  error  of  10“^^. 

This  approach  to  redundant  fault  tolerant  digital  optical  computing  may  allow  the  utilization 
of  devices  with  increased  probabilities  of  failure  without  an  unacceptably  large  system  reliability 
penalty.  Such  an  approach  may  be  required  in  order  to  make  practical  and  reliable  digital  optical 
computers  out  of  simple  arrays  of  unreliable  switching  elements. 

The  author  acknowledges  support  of  the  NSF  young  investigator  program  ECS  9258088. 
On Parallel Algorithms for Optical Image Processors

Abstract. Fine grain parallel optoelectronic processors for dedicated tasks may
deserve further consideration. The main case is machines for "real time" vision;
stochastic algorithms working on images considered as Markov fields are one
possible approach to tackling real images.


1.  Introduction 

The processing of real images is one area where optical computing may come up with
competitive solutions. The basic reason, as usual, is the number of interconnects that are
needed before any sensible processing task on an image is completed; the number of
interconnects required is particularly large in parallel implementations, which is where free
space optics is attractive. The word interconnect should be understood in a broad sense,
including data input and output and the provision of various control signals to the processing
elements as well as exchange of information among the processors themselves. We are
interested in "optical scale parallelism", where the processor consists of a number of
processing  elements  (PE's)  equal  to  the  number  of  pixels  in  the  image,  typically  at  least  10^  ; 
this puts the heaviest weight on interconnects but considerably eases programme and
data transfer control. For easy image format input, all PEs should fit together on a chip of a
few square centimetres. But technology will allow only fairly weak PEs to be integrated
on such a small area. We are then faced with a double challenge: to devise optoelectronic
architectures that make the best use of optical interconnects to maximise the computing
power, and to find algorithms that can use it for meaningful image processing tasks. Starting
with the well known architectures of optical correlators (i.e. convolvers) and optical cellular
automata, we shall devote the major part of this work to the application of optical computing
to optimisation problems. When approached using parallel versions of the class of stochastic
algorithms known as simulated annealing, these could in some reasonable future benefit
from a suitable combination of optical and microelectronic functions.


2.  From  optical  convolutions  to  optical  cellular  automata 

The simplest and most famous application of the above concept is the optical convolution,
where the PE's reduce to photodetectors and where weighted optical interconnects that define
the impulse response do all the processing. The main application is pattern recognition. The
double challenge then takes on the following form: can convolution be helpful in real
pattern recognition problems and, if so, can optics implement such convolutions? Progress
on filter reprogrammability, adaptivity to the input signal and invariances has been fast in
the last few years [1] and will be reported in several papers in this conference. Also,
rugged and compact implementations of optical convolvers have been published [2].

Convolution nevertheless shows only limited generality and it is important to seek
broader classes of image processors. The next simple case is cellular automata [3], which
have been investigated in some detail for some years under various names, including
symbolic substitution and mathematical morphology [4]. Their operation cycle consists of
the combination of one convolution and one point nonlinearity. The main role of optics here
is to implement the convolution part, while optoelectronic or nonlinear optical devices
located at every pixel respond nonlinearly to its result.
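
The operation cycle described above (one convolution followed by one point nonlinearity) can be illustrated with a minimal sketch. It is a purely electronic simulation under stated assumptions: the names `convolve2d`, `automaton_cycle` and `CROSS`, the zero padding at the image border and the choice of a hard threshold as the nonlinearity are all illustrative and are not taken from the paper. With a cross-shaped kernel, a threshold of 1 gives a morphological dilation and a threshold of 5 gives an erosion.

```python
def convolve2d(image, kernel):
    """Same-size 2-D convolution with zero padding (plain Python lists)."""
    rows, cols = len(image), len(image[0])
    krows, kcols = len(kernel), len(kernel[0])
    center_r, center_c = krows // 2, kcols // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for i in range(krows):
                for j in range(kcols):
                    rr = r - (i - center_r)
                    cc = c - (j - center_c)
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += kernel[i][j] * image[rr][cc]
            out[r][c] = total
    return out


def automaton_cycle(image, kernel, threshold):
    """One cycle: convolution (the optical part), then a point threshold."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in convolve2d(image, kernel)]


# Hypothetical four-nearest-neighbour kernel (centre pixel included).
CROSS = [[0, 1, 0],
         [1, 1, 1],
         [0, 1, 0]]

dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 1

dilated = automaton_cycle(dot, CROSS, threshold=1)    # dilation
eroded = automaton_cycle(dilated, CROSS, threshold=5)  # erosion
```

In this sketch only the threshold changes between the two morphological operations, which is one way to read the remark that the convolution part can stay in the optics while the point nonlinearity is left to the device at each pixel.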

The  invited  communication  by  Casasent  in  this  conference  develops  mathematical 
morphology  applications  for  a  number  of  image  processing  tasks. 


3.  Optimisation  problems  in  image  processing 

One  further  step  is  to  introduce  optimisation  problems  on  images  into  the  realm  of  optical 
computing.  In  an  optimisation  problem,  an  energy  function  E(x)  is  introduced  as  a  measure 
of  the  departure  of  an  image  x  from  an  ideal  goal  to  be  reached  by  the  processing.  The 
definition  of  function  E  incorporates  all  relevant  knowledge,  i.e.  the  input  data  but  also,  for 
example,  the  sources  of  degradation  to  be  removed,  a  priori  information  on  the  class  of 
object, and the features of interest. The processing minimises the energy with respect to x.
Typical  applications  include  edge,  texture  and  motion  detection,  as  well  as  higher  level  tasks 
such  as  pattern  classification  [5]. 

Previous algorithmic work, notably by Geman et al. [6], has demonstrated energy
functions  that  can  detect,  for  example,  texture,  edge  or  motion  in  fairly  difficult  situations. 

However,  the  computational  load  is  usually  extremely  heavy  because  a  small  change 
in  the  image  can  generate  a  large  change  in  the  energy  -  the  "energy  landscape"  is  said  to  be 
wild  -  so  that  secondary  minima  will  prevent  deterministic  descent  algorithms  from  reaching 
the  desired  minimum  or  even  an  acceptable  suboptimal  solution.  With  most  energy 
functions, the problem is non-polynomial in time, i.e. the optimal image can be found
only  through  exhaustive  search  in  the  space  of  all  possible  images,  whose  size  increases 
exponentially  with  the  number  of  pixels.  As  a  consequence,  it  is  impossible  in  practice  to 
find  the  absolute  minimum. 

But the situation is even worse than that: efficient suboptimal procedures are
themselves hardly practicable. Let us take the example of simulated annealing, which was
advocated by Geman in the work cited above and will be developed in the next sections.
Finding a good suboptimal solution will typically require looping through an energy
updating procedure a very large number of times, typically of the order of a few million times
the number of pixels. This is still impractical unless some way can be found to implement
the updates in parallel: as we shall illustrate now, this parallel implementation is where, in our
opinion, optics may have a new role to play. To conclude this discussion, among the
powerful algorithms that have been found to make progress in the processing of real images, one
subset may be open to "optical scale parallelism" and this is what we are investigating.


4.  Simulated  annealing  :  one  example 

In the following, for the sake of specificity, we shall discuss one particular example of an
optimisation technique. While it may not be completely general, it has the advantage of
simplicity, without being trivial, and it allows us to introduce all the relations that we see
between parallel stochastic algorithms and optical computing. This example addresses the
so-called "weak string problem" [7], which is the one-dimensional version of image
restoration preserving discontinuities. In this problem, starting from the observed, real-valued,
noisy image $\tilde{x}$ as input data, two images are estimated: a real-valued smoothed
image $x$ and a binary border image $b$, equal to 1 where there is a border and to 0
everywhere else. Figure 1 shows the spatial structure of the two estimated images.

[Figure 1 layout: two rows of samples indexed i-1, i, i+1, i+2, one row for the smoothed image x and one row for the border image b.]

Figure 1. Data structure for the "weak string problem".


The energy is heuristically defined as:

$$E[b, x] = \sum_{\text{all pixels } i} \left[ (x_i - \tilde{x}_i)^2 + \lambda\,(x_i - x_{i+1})^2\,(1 - b_i) + \mu\, b_i \right] \qquad (1)$$

The first term enforces similarity between the estimated image $x$ and the input data $\tilde{x}$, the
second term is a smoothness constraint applied to adjacent pixels that have no border pixel
between them, and the third term is a border penalty to avoid uncontrolled growth of
the borders. $\lambda$ and $\mu$ are adjustable parameters.

In this case, simulated annealing consists in iterating the following steps:

• 1 - select one border pixel $b_i$ for "updating",

• 2 - evaluate the energy change $\delta E$ when $b_i$ is changed from 0 to 1,

• 3 - set $b_i = 1$ with probability

$$p(\delta E) = \frac{1}{1 + \exp(\delta E / T)} \qquad (2)$$

where T is a parameter, called temperature by analogy with statistical physics,

• iterate the updating (steps 1-3), slowly decreasing T according to some suitable "annealing
schedule" (see the reference by Geman cited above).

Notes:

• the fact that this procedure is a good heuristic for estimation problems is
documented in the literature; we shall be content here with the comment that the initial high
temperatures allow basically any energy change to be accepted with a significant probability and
therefore avoid being trapped in local minima of the energy landscape, while the final low
temperatures are useful for finding local minima in a neighbourhood where the energy is
already quite low;

• in other cases, the change of $x_i$ to $x_i + \delta x$ should also be made in a stochastic mode but
this is not so useful here because the energy function is everywhere convex in the variable
$x$; it is therefore relatively simple to find its minimum; for a more complete description of
this intricate technical point, see reference 8.

As already mentioned, we are interested in the parallel implementation of simulated
annealing. In the present case, the energy change related to one border pixel $i$ depends on $b_i$
but not on the other border pixels $b_j$, $j \neq i$. Parallel updating of all border pixels is therefore
possible. More generally, parallelism of a degree linear in the number of pixels is legal if the
energy change at one point depends only on a finite neighbourhood of this point. Asynchronous
operation is an appealing but more complex issue that we shall not approach here.
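
The parallel border update discussed in this section can be sketched in a few lines. This is only an illustration under explicit assumptions, not the implementation studied in the paper: the energy change is taken as delta_E_i = mu - lam * (x_i - x_{i+1})**2 (an assumed reading of the weak string energy, with border pixel i lying between pixels i and i+1), the smoothed image x is held fixed instead of being re-estimated, and the geometric cooling schedule and all function names are hypothetical.

```python
import math
import random


def acceptance_probability(delta_e, temperature):
    """p = 1 / (1 + exp(delta_e / T)), written so that exp never overflows."""
    z = delta_e / temperature
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def border_energy_changes(x, lam, mu):
    """Assumed energy change for setting each border pixel from 0 to 1."""
    return [mu - lam * (x[i] - x[i + 1]) ** 2 for i in range(len(x) - 1)]


def parallel_update(x, lam, mu, temperature, rng):
    """Update all border pixels at once; each decision is independent."""
    return [1 if rng.random() < acceptance_probability(de, temperature) else 0
            for de in border_energy_changes(x, lam, mu)]


def anneal_borders(x, lam, mu, t_start=10.0, t_end=1e-3, cooling=0.9, seed=0):
    """Repeat the parallel update while lowering T (hypothetical schedule)."""
    rng = random.Random(seed)
    temperature = t_start
    while temperature > t_end:
        parallel_update(x, lam, mu, temperature, rng)
        temperature *= cooling
    # Final update at the lowest temperature: almost deterministic.
    return parallel_update(x, lam, mu, t_end, rng)


borders = anneal_borders([0.0, 0.0, 0.0, 5.0, 5.0, 5.0], lam=1.0, mu=1.0)
```

Because x is frozen here, the intermediate updates do not influence the final one; in a full treatment the smoothed image would be re-estimated between border updates, which is where the slow cooling matters.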

5. Optical computing solutions for parallel simulated annealing

In this context, optical computing can provide three functions: convolution for the
calculation of $\delta E$, production of the random numbers required to implement the stochastic
decision part of simulated annealing, and optoelectronic thresholding.

5.1. Optical convolution:

Firstly, the calculation of energy variations $\delta E$ often implies convolutions, and we are back
to the optical convolution of section 2. Specifically, let us give two examples:

• from equation 1, it is straightforward to derive

$$\delta E = \mu - \lambda\,(x_i - x_{i+1})^2 \qquad (3)$$

which may be implemented in part using a coherent optical convolution on the $x$ image
followed by the usual quadratic detection of optics. In this simple example, the convolution
reduces to a straightforward difference between the grey values of two nearest-neighbour
pixels. Quite likely, it is much easier to implement in hardwired microelectronics than using
optical convolution. The same is not true, however, for more complex energy functions that
involve larger neighbourhoods; texture detection is one example.

• If the $x$ dependency of the energy function is quadratic but not everywhere convex, then,
as already mentioned, energy minimisation with respect to $x$ is a difficult task and simulated
annealing over $x$ may be useful. Differentiation of a quadratic form will obviously always
lead to a linear form, i.e. a matrix product, which in turn simplifies to a convolution if the
energy model assumes stationarity of the image properties.

5.2. Parallel generation of random numbers:

The parallel generation of a large quantity of random numbers of good statistical quality is
a problem. The requirement here is to provide independent random numbers with the
appropriate statistics to all processing elements, i.e. every pixel, at every energy update
operation,  which  means  around  10  random  numbers  per  second  on  a  microelectronic  chip. 
We have suggested using a physical random number generator for this purpose and
investigated the use of laser speckle projected onto an array of photodiodes. The electronic
part of our parallel processor therefore consists of a "smart pixels" chip with at least one
photodetector per image pixel being processed. Of course, with suitable sequencing, the
same photodetector may be used for image input, for the input of $\delta E$, and for the speckle
input.

Specifically,  we  have  shown  that  speckle  statistics  can  be  easily  moulded  into  exactly 
the  required  form  of  probability  law,  and  that  the  simulated  temperature  can  be  controlled 
directly  by  the  average  speckle  brightness,  i.e.  the  laser  power  [9].  Let  us  just  summarise  the 
basic  principle  in  a  few  lines. 

A  fully  developed  speckle  integrated  over  the  area  of  a  photodiode  obeys  known 
statistics that depend on the number of speckle grains over the detector area [10]. We have
experimentally  demonstrated  the  possibility  of  producing  10^^  random  numbers  per  second 
by projecting speckle from a suitably moving diffuser onto a 1 cm² silicon photodetector
array [11]. Figure 2 illustrates how to obtain the probability law required by equation (2): two
speckle photodetectors are used instead of one. An analogue adder combines the signal from
the first photodetector with the energy difference $\delta E$. The result is sent to the positive input
of  a  thresholding  gate,  while  its  negative  input  receives  the  second  speckle  photodetector 
signal. Analysis shows that, with suitable speckle parameters, the resulting probability that
the positive input exceeds the negative input approximates equation (2) quite well.
Temperature is emulated by the average speckle intensity.


Figure  2.  Generating  the  probability  law  of  equation  2  with  two  speckle  samples. 
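
The two-detector principle can be illustrated analytically in the simplest case. The sketch below rests on assumptions that are not taken from the paper: each detector is assumed to see a single speckle grain, so that its intensity follows a negative exponential law with mean `mean_intensity`; the two detectors are assumed independent; and the sign convention (decision taken when the second sample exceeds the first sample plus the energy difference) is chosen here so that the probability decreases with the energy difference, as a sigmoid acceptance law requires. Under these assumptions the difference of the two intensities follows a Laplace law, which gives the closed form used in `speckle_probability`. The function names are hypothetical.

```python
import math


def logistic_probability(delta_e, temperature):
    """Target law p = 1 / (1 + exp(delta_e / T)), overflow-safe."""
    z = delta_e / temperature
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def speckle_probability(delta_e, mean_intensity):
    """P(I2 > I1 + delta_e) for two independent exponential intensities."""
    if delta_e >= 0:
        return 0.5 * math.exp(-delta_e / mean_intensity)
    return 1.0 - 0.5 * math.exp(delta_e / mean_intensity)


def comparison_table(mean_intensity=1.0, temperature=1.0):
    """Rows of (delta_e, speckle law, logistic law) for a visual check."""
    return [(de, speckle_probability(de, mean_intensity),
             logistic_probability(de, temperature))
            for de in (-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0)]


for delta_e, p_speckle, p_logistic in comparison_table():
    print(f"{delta_e:+.1f}  {p_speckle:.3f}  {p_logistic:.3f}")
```

Both laws equal 1/2 at zero energy difference, are symmetric about that point and scale with a single parameter (mean intensity or temperature), which is consistent with temperature being emulated by the average speckle intensity. How closely the two curves agree depends on the speckle parameters (for instance the number of grains per detector), which this sketch does not try to optimise.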

5.3. Optoelectronic thresholding:

Finally, novel optoelectronic or nonlinear optical arrays such as SEEDs or pnpn
photothyristors may be used to make the required decision. For example, as shown in the
contribution by Fremont et al. in this conference [12], the role of the thresholding gate of
figure 2 can be played by a differential pair of optical photothyristors. The output, i.e. border
pixel $b_i$, is then available in the form of an optical signal for some further processing step or
for the output of results.

More  generally,  integrated  circuits  with  a  complexity  depending  on  the  particular 
energy  function  will  be  required  in  association  with  optics  ;  globally,  only  the  combination 
of  electronic  and  optical  functions  can  open  the  way  to  compact,  massively  parallel 
integration  of  image  processors  for  such  algorithms. 


Conclusion 

The  favourite  operation  of  analogue  optical  processing,  convolution,  may  provide  a  solution 
for  a  certain  number  of  image  processing  problems.  But  we  believe  that  it  could  well  be 
combined  with  other  readily  available  optical  functions  and  with  integrated  circuit 
microtechnology  into  video-real-time  systems  for  a  significantly  wider  class  of  vision 
problems,  in  particular  optimisation  problems  that  can  be  handled  by  parallel  stochastic 
algorithms. 

We acknowledge the contributions of E. Belhaire, F. Devos, P. Garda, L. Gamero,
G. Fremont, D. Prevost and J.C. Rodier to the research leading to this work.

Duality between algorithms and optical
implementations: actual examples


J.  P.  Heer,  P.  Pellat-Finet 

ENST  de  Bretagne,  BP  832,  29285  Brest,  France 

Abstract. Considerations on number representations are introduced to
make use of the advantages of optics in computer architectures. Specific optical
systems are proposed for CORDIC algorithms and for residue arithmetic
with applications to discrete transforms.


1.  Introduction 

The recent developments of bistable and fast switching optoelectronic components (combination
of laser diodes and photodiodes, S-SEED) do not necessarily imply that optics
could be an alternative technology for high speed parallel computers. There is indeed a
need for a duality between algorithms and optical implementations and this duality is
not generally considered by scientists working in optical computing.

As an example of this duality, we will focus on some number representations and
their corresponding optical processing architectures. We will distinguish only two well
known number representations: the positional and the residue number representations.
More precisely we will show that the choice of appropriate number representations
and optical architectures leads to actual implementations of more general algorithms for
fast parallel computing, which should reinforce fast component capabilities.

2.  An  optical  implementation  for  CORDIC  algorithms 

2.1.  The  modified  signed  digit  representation  and  its  optical  implementation 

The binary number representation is a positional representation, widely used in electronic
computers. Nevertheless it appears to be imperfectly suited to optical implementations
because of its sequential nature. An alternative is the signed digit number representation,
introduced by Avizienis [1]: it allows carry free parallel processing with a number of steps
independent of the number of digits. Attention has been paid to the modified signed
digit (MSD) representation [2] and an optoelectronic MSD adder has been built [3]. We
will see in the next section how this adder can be inserted in a larger architecture in
order to perform CORDIC algorithms. We only emphasize at this stage that the adder
provides an example of the necessary adaptation (duality) between an algorithm and an
optical implementation, taking advantage of the parallel processing capabilities of optics.

2.2. Application to CORDIC-like algorithms

CORDIC (COordinate Rotations on a DIgital Computer) algorithms were introduced by
Volder [4] to compute trigonometric functions by performing only additions and shifts.
According to Walther [5, 6] an extended algorithm is as follows. We introduce a decreasing
sequence $(\varepsilon_n)$ ($n$ is a non-negative integer) and a number $m$ which depend on the function
to be computed (examples will be given below). Consider the sequences $(x_n)$, $(y_n)$ and
$(z_n)$ such that:

$$x_{n+1} = x_n - m\,d_n\,y_n\,2^{-n}; \qquad y_{n+1} = y_n + d_n\,x_n\,2^{-n}; \qquad z_{n+1} = z_n - d_n\,\varepsilon_n \qquad (1)$$

where $(d_n)$ is a sequence depending on sequences $(y_n)$ and $(z_n)$ and on the function to
be computed.

It can be shown [5,6] that for $\varepsilon_n = \tan^{-1} 2^{-n}$, $m = 1$, $x_0 = \left[\prod_{n=0}^{\infty} (1 + 2^{-2n})\right]^{-1/2}$,
$y_0 = 0$, and $d_n = \operatorname{sign} z_n$, the sequence $(x_n)$ tends to $\cos z_0$ and the sequence $(y_n)$ to $\sin z_0$
for infinite $n$. For $\varepsilon_n = 2^{-n}$, $m = 0$, $d_n = \operatorname{sign} z_n$ and $y_0 = 0$, the limit of $y_n$ is $x_0 z_0$.
For $\varepsilon_n = 2^{-n}$, $m = 0$, $d_n = -\operatorname{sign} y_n$ and $z_0 = 0$, the limit of $z_n$ is $y_0/x_0$. Other initial
values lead to other functions such as $\tan^{-1}$ and hyperbolic functions. Similar
(CORDIC-like) algorithms can be used to compute the exponential function [7]. Hence
the algorithm allows the computation of classical elementary functions, multiplication
and division.
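
The three cases listed above can be checked numerically with a short floating-point sketch. It follows the standard Walther form of the recurrence, x <- x - m*d*y*2^-n, y <- y + d*x*2^-n, z <- z - d*eps_n, with n starting at 0; the function names, the iteration count and the convention sign(0) = +1 are illustrative assumptions. An optical or MSD implementation would replace the floating-point multiplications by 2^-n with digit shifts, which this sketch does not model.

```python
import math


def cordic(x0, y0, z0, m, mode, iterations=40):
    """Iterate the CORDIC recurrence.

    mode "rotation":  d_n = sign(z_n), drives z towards 0.
    mode "vectoring": d_n = -sign(y_n), drives y towards 0.
    m = 1 uses eps_n = atan(2**-n); m = 0 uses eps_n = 2**-n.
    """
    x, y, z = x0, y0, z0
    for n in range(iterations):
        shift = 2.0 ** -n
        eps = math.atan(shift) if m == 1 else shift
        if mode == "rotation":
            d = 1.0 if z >= 0 else -1.0
        else:
            d = -1.0 if y >= 0 else 1.0
        # Simultaneous update: the right-hand side uses the old x and y.
        x, y, z = x - m * d * y * shift, y + d * x * shift, z - d * eps
    return x, y, z


def circular_start_value(iterations=40):
    """x0 = [prod (1 + 2**(-2n))]**(-1/2), truncated to the iteration count."""
    product = 1.0
    for n in range(iterations):
        product *= 1.0 + 2.0 ** (-2 * n)
    return product ** -0.5


def cordic_cos_sin(angle):
    x, y, _ = cordic(circular_start_value(), 0.0, angle, m=1, mode="rotation")
    return x, y


def cordic_multiply(a, b):
    """Returns a * b; converges for |b| <= 2 with n starting at 0."""
    _, y, _ = cordic(a, 0.0, b, m=0, mode="rotation")
    return y


def cordic_divide(numerator, denominator):
    """Returns numerator / denominator for denominator > 0, |ratio| <= 2."""
    _, _, z = cordic(denominator, numerator, 0.0, m=0, mode="vectoring")
    return z
```

For example, `cordic_cos_sin(0.5)` can be compared with `math.cos(0.5)` and `math.sin(0.5)`, `cordic_multiply(3.0, 0.7)` with 2.1, and `cordic_divide(2.0, 4.0)` with 0.5. The convergence ranges quoted in the docstrings follow from the sums of the step sizes and are properties of this sketch rather than statements from the paper.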

In eq. (1) we perform multiplications of $x_n$ and $y_n$ by $2^{-n}$ (because $|m d_n| = |d_n| = 1$),
that is, if the binary representation is used, shifts by $n$ digits. Moreover, it can be shown
that CORDIC-like algorithms are compatible with the MSD representation of numbers
[7]. This means that additions in eq. (1) can be achieved carry free with parallel
processing. In this way the above-mentioned adder can be used and only needs to be
completed with a digit shifter. An optical solution to the digit shifter has been proposed
and built and a general optical architecture for CORDIC-like algorithms can be deduced
[7]. The adaptation of an optical implementation to CORDIC algorithms is an example
of the necessary duality between algorithms and implementations.


3.  An  optical  implementation  for  discrete  transforms 
3.1. Optical processor for residue arithmetic

We  will  describe  a  residue  arithmetic  optical  processor  that  we  recently  built  [7].  Residue 
arithmetic  will  be  assumed  to  be  known  [8,9]:  it  reduces  the  operation  complexity  by 
dividing  the  number  representation  into  smaller  independent  integers,  allowing  parallel 
processing. 

Although the actual processor deals with numbers modulo 15, we will explain its
principle with 5 as modulus. The processor, whose design is given in fig. 1, has two entries
which represent the numbers to be processed. An entry is a luminous source, called an
input source, made up of five LED rods. Each rod is assigned to a value in the set
{0, 1, 2, 3, 4} and only one rod among the five is activated according to the value to be
represented. One input source is made up of horizontal rods and the other of vertical
ones. The area spanned by the five rods is a square.

A beam splitter mixes light beams coming from the input sources, the planes of
which are symmetrical to each other with respect to the beam splitter separating area.




A spherical lens focuses the input source images on an output plane in which there
is a square array of 25 detectors. Among the 25 detectors, 16 do not receive any light,
8 receive light from one rod and only one receives light from two rods, one from each of
the two input sources. The latter detector indicates on which input values the operation
has to be performed. In order to distinguish the response of this detector from the
others, a thresholding has to be performed; since this is a non-linear operation, it
is achieved electronically. Since only three illumination levels have to be distinguished
the system is highly insensitive to noise.

Behind the detector array, there is another light source, called the output source, which
is designed like an input source. Each of the 25 detectors is linked to a rod of an output
source, the choice of the rod depending on the operation to be performed. In practice the
electric current coming from a detector has to be amplified to be high enough to drive
an LED rod. The thresholding and amplification processing and the link between detectors
and the output source are included in what is denoted by electronic link in fig. 1 (where
only one output is drawn). An example is given in fig. 1 and deals with 1+2=3. The
inversions due to the lens and the beam splitter have not been taken into account. Shaded
areas indicate activated LEDs and detectors. Only the activated electronic link has been
drawn.

Such a system does not carry out any calculation but only looks up the result in
a memory: the input sources and the beam splitter help to find the address and the
electronic link assigns the result value. It should be clear that the system would run at
the same speed whatever the modulus. The processor that was built works at a 20 MHz clock
rate. A more complete system should be made up of various processors working with
different moduli. Notice that two outputs can be obtained with a single beam splitter
and then two operations can be performed simultaneously.
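
The look-up principle of this processor can be mimicked in a few lines. The sketch is a behavioural model only, under stated assumptions: one input rod is taken to light a full row of the detector array and the other a full column, the threshold is set at two illumination units, and the names `build_link_table`, `illuminate` and `residue_processor` are hypothetical. The "electronic link" is modelled as a precomputed table, which reflects the remark that no calculation is carried out at run time.

```python
def build_link_table(modulus, operation):
    """Electronic link: detector (row a, column b) -> output rod value."""
    return [[operation(a, b) % modulus for b in range(modulus)]
            for a in range(modulus)]


def illuminate(a, b, modulus):
    """Light level on each detector: rod a lights row a, rod b lights column b."""
    return [[int(r == a) + int(c == b) for c in range(modulus)]
            for r in range(modulus)]


def residue_processor(a, b, link_table):
    """Threshold the detector array and read the linked output rod."""
    modulus = len(link_table)
    levels = illuminate(a % modulus, b % modulus, modulus)
    # Only the detector lit by both input sources reaches level 2.
    (row, col), = [(r, c) for r in range(modulus) for c in range(modulus)
                   if levels[r][c] >= 2]
    return link_table[row][col]


ADD_MOD_5 = build_link_table(5, lambda a, b: a + b)
MUL_MOD_5 = build_link_table(5, lambda a, b: a * b)

print(residue_processor(1, 2, ADD_MOD_5))  # the 1+2 example, modulo 5
```

Changing the operation only changes the table, not the illumination or thresholding steps, which is why the same optical front end can serve additions and multiplications.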

3.2. Quadratic residue arithmetic

Quadratic residue arithmetic is useful to handle complex numbers [8]. The equation
$\hat{\imath}^2 \equiv -1 \pmod m$ admits a solution $\hat{\imath}$ if $m$ is equal to 2 or is a prime number of the
form $4n + 1$ where $n$ is a positive integer. Let $x = x_r + i x_i$ be a complex number where
$x_r$ and $x_i$ are integers. Let $x_r \equiv a \pmod m$ and $x_i \equiv b \pmod m$. Then $x$ is represented
by $(a + \hat{\imath} b,\; a - \hat{\imath} b)$, that is, by two integers modulo $m$ on which algebraic operations are
performed separately. Notice that $a$ and $b$ can easily be deduced from $a + \hat{\imath} b$ and $a - \hat{\imath} b$.
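
A small worked example may make the representation concrete. The sketch below assumes an odd prime modulus of the form 4n + 1 (here m = 13, for which 5 squared is congruent to -1) and uses Python's three-argument `pow` with a negative exponent for modular inverses, which requires Python 3.8 or later. The function names are hypothetical and the example values are chosen only for illustration.

```python
def find_root_of_minus_one(m):
    """Smallest r in 1..m-1 with r*r congruent to -1 modulo m."""
    for r in range(1, m):
        if (r * r + 1) % m == 0:
            return r
    raise ValueError(f"no square root of -1 modulo {m}")


def encode(x_real, x_imag, m, root):
    """Complex integer -> pair (a + root*b, a - root*b) modulo m."""
    a, b = x_real % m, x_imag % m
    return ((a + root * b) % m, (a - root * b) % m)


def add(p, q, m):
    """Componentwise addition: the two channels never interact."""
    return ((p[0] + q[0]) % m, (p[1] + q[1]) % m)


def multiply(p, q, m):
    """Componentwise multiplication replaces the complex product."""
    return ((p[0] * q[0]) % m, (p[1] * q[1]) % m)


def decode(p, m, root):
    """Recover (a, b) from the pair; valid for an odd prime modulus."""
    u, v = p
    a = (u + v) * pow(2, -1, m) % m
    b = (u - v) * pow(2 * root, -1, m) % m
    return a, b


M = 13
ROOT = find_root_of_minus_one(M)

# (2 + 3i)(1 + 4i) = -10 + 11i, i.e. (3, 11) modulo 13.
product = decode(multiply(encode(2, 3, M, ROOT), encode(1, 4, M, ROOT), M), M, ROOT)
total = decode(add(encode(2, 3, M, ROOT), encode(1, 4, M, ROOT), M), M, ROOT)
```

The complex multiplication, which normally needs four real multiplications and two additions, becomes two independent modular multiplications; each of them could in principle be handled by a look-up processor of the kind described in paragraph 3.1.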

3.3.  Optical  architecture  for  discrete  transforms 

We explain now how the optical residue processor can be included in an architecture for
discrete transforms. We will choose the fast Fourier transform with the usual butterfly
architecture on $N$ points. We consider a residue system with $M$ moduli $m_j$ ($1 \le j \le M$)
and let $[x]_j$ denote the residue of $x$ modulo $m_j$. If $w = \exp(-2i\pi/N)$, the discrete Fourier
transform of $x(k)$ ($k = 1, \ldots, N$) is $\hat{x}(n) = \sum_k x(k)\,w^{nk}$ and splits into $M$ transforms modulo
$m_j$ defined by [10]:

$$[\hat{x}(n)]_j = \left[\sum_k [x(k)]_j\,[w^{nk}]_j\right]_j \qquad (2)$$

Since $w$ is a complex number, eq. (2) must be written according to the quadratic residue
system. The optical implementation of an elementary cell of the butterfly structure is
given in fig. 2. Only the beam splitter is drawn: it represents an optical processor
as in paragraph 3.1. Additions and multiplications by powers of $w$ are performed according
to quadratic residue arithmetic. A general butterfly architecture can be deduced [7].


[Figures: Fig. 1 and Fig. 2; label "Input sources".]

Abstract. The Optical Cellular Logic Image Processor, O-CLIP, was developed
to demonstrate the functional capabilities of optically interconnected
logic arrays. The O-CLIP can in principle implement any binary image algebra
task. As an example, we present here a new shortest path hunt
algorithm that precisely maps onto the architecture of the constructed O-CLIP.
The main phases of the routine are explained. The efficiency of the
algorithm is compared to its serial implementation.


1.  Introduction. 

In 1961, Lee developed a maze algorithm capable of finding the shortest path(s) between
two given points within a rectangular grid filled with obstacles [1]. For this algorithm,
the nearest neighbours of the starting point (marked S in figure 1a) are labelled. The
labelling is repeated in increasing order to the next nearest neighbours until the finish
point (marked F) is reached (figure 1b). The shortest path is then found by tracing back
from the finish point the decreasing sequence of the labels generated by the expansion
process. This algorithm, also called the shortest path hunt algorithm, was later derived
for single instruction, multiple data (SIMD) computers within the image logic algebra
(ILA) [2, 3]. The optical implementation, however, necessitates at certain stages a fan-out
corresponding to the number of elements of the grid. The new version presented here
utilizes a four-nearest-neighbour interconnect. All the information as to the labelling of
the elements can be held on just three label planes. Three different labels are in fact
the minimum number needed to distinguish the character of a non-monotonic number
sequence (figure 1c).

E-mail: marc@phy.hw.ac.uk


Figure 1a-1c. Lee and modified Lee routines.


Figure  2.  Schematic  of  the  O-CLIP  layout. 


Figure 3. Parallel and serial implementation of the Lee routine. Benchmark. (Horizontal axis: row or column length, in pixels.)

2. The O-CLIP implementation.

The algorithm is to be utilized for a four-nearest-neighbour interconnect system such as
in the O-CLIP [4]. The different processing planes are Symmetric Self-Electrooptic-Effect
Device (S-SEED) arrays. The devices can be programmed as NAND/NOR logic gates [5].
They can also be used as inverting buffers. A schematic of a possible layout is shown in
figure 2. Besides the three different S-SEED arrays which store the label sequence, three
cache memories are used to perform the intermediate processing steps. Label planes and
cache memories form individual optical loops within the O-CLIP architecture. The loops
ensure the optical feedback of a previously processed array. This array is compared with
data from the input spatial light modulator (SLM) or with the information from another
loop. The result of this comparison is then fanned out, thresholded and sent to one of
the label planes or the cache memories. This is performed by the fan-out and threshold
units and the optical logic plates. The input information is provided by an SLM.

3.  Explanation  of  the  algorithm. 

The algorithm has been run on a distributed array processor (DAP). This parallel (electronic)
computer simulates the algorithm as if implemented in the S-SEED O-CLIP. The
two main phases of the algorithm are given below: the expansion of the labelling process
(section A) is carried out in a diamond-like shape and the tracing back of the shortest
path(s) (section B) is performed up to the starting point. The expansion phase checks
that the expanded wave does not intersect the obstacles and the other label planes. The
tracing phase ensures that the shortest path(s) is (are) found by following the decreasing
sequence starting from the finish point. The path(s) is (are) constructed, in real time, in
one of the cache memories. The steps of the algorithm are presented below, along with
the number of S-SEED switching periods required for each step.

Expansion phase. Section A.

Until expansion reaches finish point Do

    FAN-OUT (cache 1) -> thresholder -> fan storage unit (5 steps)
    NAND (obstacles, fan storage) -> cache 3 (4 steps)
    NOR (cache 3, label plane j-1) -> cache 1 (4 steps)
    NOT (cache 1) -> cache 3 (3 steps)
    NOR (cache 3, label plane j+1) -> label plane j (4 steps)
    increase j by 1

End Do


Path back phase. Section B.

Start with label plane containing finish point (assume label j)

Until path reaches starting point Do

    FAN-OUT (cache 1) -> thresholder -> fan storage unit (5 steps)
    NAND (label plane j-1, fan storage) -> cache 3 (4 steps)
    NOT (cache 3) -> cache 1 (3 steps)
    NOR (cache 2, cache 1) -> cache 3 (4 steps)
    NOT (cache 3) -> cache 2 (3 steps)
    decrease j by 1

End Do

Although  conditional  statements  are  explicitly  written  above,  the  algorithm  allows  the 
different  loops  to  be  run  indefinitely.  Alternatively,  the  rapid  fill  algorithm  can  be  used 
to  test  the  validity  of  the  conditional  statements  [4]. 
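
The idea that three cyclic labels are enough for both the expansion and the path-back phases can be illustrated with a conventional, sequential sketch. It is not a model of the S-SEED logic steps listed above: the grid is a list of strings with '#' marking obstacles, the labels 1, 2, 3 stand for the three label planes, and the function name `lee_three_labels` is hypothetical. The sketch relies on the fact that, on a four-neighbour grid, the distance from the start to any neighbour of a cell differs from the distance to that cell by exactly one, so the label that precedes the current one in the cycle identifies a cell one step closer to the start.

```python
def lee_three_labels(grid, start, finish):
    """Shortest path on a 4-connected grid using only labels 1, 2, 3."""
    rows, cols = len(grid), len(grid[0])
    label = [[0] * cols for _ in range(rows)]  # 0 means not yet labelled

    def neighbours(r, c):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols and grid[rr][cc] != '#':
                yield rr, cc

    # Expansion phase: the wavefront at distance d gets label d mod 3, plus 1.
    label[start[0]][start[1]] = 1
    front, distance = [start], 0
    while front and label[finish[0]][finish[1]] == 0:
        distance += 1
        new_front = []
        for r, c in front:
            for rr, cc in neighbours(r, c):
                if label[rr][cc] == 0:
                    label[rr][cc] = distance % 3 + 1
                    new_front.append((rr, cc))
        front = new_front
    if label[finish[0]][finish[1]] == 0:
        return None  # the finish point cannot be reached

    # Path-back phase: follow the cyclically preceding label to the start.
    path = [finish]
    r, c = finish
    while (r, c) != start:
        wanted = (label[r][c] - 2) % 3 + 1
        r, c = next((rr, cc) for rr, cc in neighbours(r, c)
                    if label[rr][cc] == wanted)
        path.append((r, c))
    path.reverse()
    return path


maze = ["S#.",
        ".#.",
        "..F"]
route = lee_three_labels(maze, start=(0, 0), finish=(2, 2))
```

In this sequential form each wavefront is built cell by cell; in the optical version every cell of a plane is processed in the same switching period, which is where the parallel speed-up comes from.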


4.  Efficiency  of  the  algorithm 

The total number of switching periods needed to implement the Lee algorithm depends on the
size of the problem and on the nature of the logical devices. In this example the parallel
implementation of the Lee routine is carried out with S-SEED arrays [5]. This device is
usually clocked in three consecutive steps: the SEED is preset by an optical beam to program
the logical operation; the signal beams write onto the devices; and the devices are then read
to propagate the information onto the neighbouring array. The worst case, in terms of the
number of switching periods, has been analysed for both the parallel and the serial
implementation. The number of switching periods for the whole algorithm is less than the sum
of the switching periods of the individual steps because different operations can be
time-multiplexed in the various arrays of the O-CLIP. As a consequence, the Lee routine
requires only 68×(N-1) switching periods for an N×N array, compared with N×(N+1) for the
serial implementation (figure 3). Break-even occurs for a 66×66 array.
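
The break-even size can be checked directly from the two counts quoted above. The short sketch below assumes only those two expressions; the function names are illustrative.

```python
def parallel_periods(n):
    """Switching periods quoted for the parallel Lee routine on an n x n array."""
    return 68 * (n - 1)

def serial_periods(n):
    """Switching periods quoted for the serial implementation on an n x n array."""
    return n * (n + 1)

def break_even(max_n=1000):
    """Smallest n > 1 at which the parallel count no longer exceeds the serial one."""
    for n in range(2, max_n + 1):
        if parallel_periods(n) <= serial_periods(n):
            return n
    return None
```

For N = 65 the parallel count (4352) still exceeds the serial one (4290); for N = 66 it is 4420 against 4422, which agrees with the stated break-even.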


5.  Conclusion. 

A new algorithm for the S-SEED O-CLIP has been presented. A DAP simulation has
been carried out and the main parts of the algorithm were explained with reference to
it. The parallel implementation has been shown to be more efficient
than the bit-serial implementation for array sizes larger than 66×66.

Abstract. In this paper, we devise a methodology for optical component modeling. We
present the general description of single-port and multi-port optical components. We then use
signal flow graphs to analyze optical architectures prior to the implementation phase. The
resulting graph illustrates the behavior of the optical data signal at the output of the optical system.


1.  Introduction 

The development of optical components has attracted great interest in the last few decades
[1]. These components are successfully used for optical computing. Experimental and
practical results are available in the literature [2,3,4]. To our knowledge, only [5] has discussed
the modeling of some optical components in a classical manner, whereas [6] developed a
methodology for a special-purpose optical architecture.

The  objective  is  to  develop  a  new  methodology  that  can  be  used  to  model  widely 
used  optical  devices.  The  method  should  be  easily  extendible  to  large  optical  architectures 
that  use  these  devices. 


2.  Methodology  of  the  Component  Modeling 

We first discuss the general aspects of the modeling technique. In practice there are three
generic classes of optical network architectures: those that use fiber-optic interconnects
(fiber-optic networks), those that use waveguides (integrated-optic networks), and those
that use free-space interconnects (free-space optic networks). The analysis presented here
follows that of microwave networks; the similarity stems from our concern with traveling
waves, for which the so-called scattering parameters are useful. These parameters constitute
the S-matrix (scattering matrix). We can likewise speak of the S-matrix of an optical
component, except that for optical components we should introduce two single virtual ports
in place of each physical port. Our optical scattering matrix is therefore composed of elements
that are 2x2 matrices rather than ordinary numbers. The S-matrices of some common optical
devices, such as one-port and two-port components and directional couplers, are available in
the literature [5].


3.  General  Modeling  Concepts 

As  mentioned  earlier,  virtual  ports  have  to  be  introduced  for  the  optical  components  in 
addition  to  the  existing  physical/real  ports.  This  is  necessary  in  order  to  account  for  the 
reflections  or  losses  of  the  data  signals  traveling  through  the  optical  components.  This  might 
add  to  the  complexity  of  the  analysis  but  the  use  of  signal  flow  graphs  simplifies  the  results. 

3.1.  Single-Port  Component 

We can describe the single-port component with an equation relating the output to the input
as:

$A' = S A + C$

where $A'$ represents the output port (sink), $A$ is the input port, $S$ is a complex 2x2 matrix,
and $C$ is a complex two-element vector. The vector $C$ has a significant value only in
active components (e.g. optical sources and optical amplifiers). It is normally neglected in
passive elements. The $S$ matrix usually represents the reflection of the optical component.
Therefore, to represent an optical source which has no reflections, we should set $S$ to zero.
This gives

$A' = C$

In a signal flow graph, we can represent both of these equations as shown in figure 1, which
represents the general configuration of the single-port connection. The representation of the
optical source is the same except that the connection of the input port disappears. Therefore,
we can represent a source without reflections graphically with a branch of unity transmission
and a single output node. To represent an optical detector graphically, we should realize that
a detector is a passive single-port component. An optical detector with no reflection can be
represented in a similar fashion to that in figure 1.
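
As a small numerical illustration of the single-port relation (output equals S times input, plus C), the sketch below uses plain Python tuples for the 2x2 matrix and the two-element vectors. The numerical values, and the names `single_port`, `ZERO` and `mirror`, are hypothetical and chosen only for the example.

```python
def mat_vec(s, a):
    """Multiply a 2x2 matrix (nested tuples) by a two-element vector."""
    return (s[0][0] * a[0] + s[0][1] * a[1],
            s[1][0] * a[0] + s[1][1] * a[1])

def single_port(s, a, c=(0j, 0j)):
    """Output (sink) vector of a single-port component: S A + C."""
    sa = mat_vec(s, a)
    return (sa[0] + c[0], sa[1] + c[1])

ZERO = ((0j, 0j), (0j, 0j))

# Reflection-free source: S = 0, so the output is just the source term C.
source_out = single_port(ZERO, (0j, 0j), c=(1 + 0j, 0j))

# Passive, partially reflecting component (hypothetical values): C = 0.
mirror = ((0.9 + 0j, 0j), (0j, 0.9 + 0j))
reflected = single_port(mirror, source_out)
```

Setting `s` to `ZERO` reproduces the source case, and leaving `c` at its default reproduces the passive case described in the text.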


Figure  1.  One-port  component  signal  flow  graph.  Figure  2.  Two-port  component. 

3.2.  Two-Port  Component 

In examining how to represent the two-port component, we follow the general description
of the N-port component, which is briefly as follows. Each node, usually regarded
as a generator, has to have a sink. So for node j we introduce sink j', and for node k we
introduce sink k'. Therefore, for a two-port component, we associate two nodes with each
port. Every input node is connected with every output node by a directed branch. The
branch values are the component S-matrix elements, shown by the S values in figure 2.


4.  Optical  Architecture  Modeling 


In this section, we present a simple architecture, shown in figure 3, composed of one- and
two-port components, and use the above methodology to analyze the output attributes of the
architecture. The architecture is composed of a source, a mirror, a polarized beam splitter,
and an interference filter. For simplicity, and with no loss of generality, we ignore the
quarter-wave plates since they do not affect the data signal except by changing its
polarization. Assume also that a polarized beam splitter is an ideal two-port optical
component.


A:  Input  signal 
B:  Input  signal 
IF:  Interference  filter 
M:  Mirror 

QWP:  Quarter-wave  plate 
PBS:  Polarized  beam  splitter 
H&L:  Output  ports 


Figure  3.  Optical  architecture  to  be  modeled. 


The derivation of the signal flow graph corresponding to the optical architecture shown in
figure 3 is described in figure 4. Note that for an ideal two-port optical component, there are
no reflections. This implies that $S_{11}$ and $S_{22}$, in figure 2, are both equal to zero.
Furthermore, a null source can always be removed without affecting the graph, such as node
2' in figure 4 (b). Likewise, a sink, such as node 7, may be removed since its presence does
not affect the value of any other node. Note that when a node is removed from a signal flow
graph, all the branches connected to it are removed as well. The final signal flow graph is
thus given in figure 4 (c). We then use standard signal flow graph reduction to obtain the
transfer function that gives the relationship between the input and output. The reduction
result of figure 4 is found to be


P  S,2  T„.  Tye  (I-  T33.  Tgg.  T3.g)-l  T23' 

In a practical case, the values of the T's in the above equation are known, so we can compute
the information required at the output, such as power (intensity), phase, and others. Work is
in progress to find the mathematical model required to solve for the attributes of the
outputs. In dealing with matrices (such as the S-matrix), a good mathematical package such as
Mathematica or MATLAB is required to handle matrix inversion, especially for large optical
systems.
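
The inverse factor of the form (I - loop)^-1 in the reduction result is what a feedback loop in a signal flow graph contributes when the branch values are matrices. The sketch below shows that step for 2x2 blocks in plain Python. The matrices are hypothetical, and the function `reduce_loop` is an illustration of the generic loop rule rather than the reduction of figure 4 itself.

```python
def mat_mul(a, b):
    """Product of two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def mat_inv(a):
    """Inverse of a 2x2 matrix; raises ZeroDivisionError if it is singular."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return ((a[1][1] / det, -a[0][1] / det),
            (-a[1][0] / det, a[0][0] / det))

IDENTITY = ((1.0, 0.0), (0.0, 1.0))

def reduce_loop(t_out, loop, t_in):
    """Transfer matrix t_out (I - loop)^-1 t_in of a path with one feedback loop."""
    i_minus_loop = tuple(
        tuple(IDENTITY[i][j] - loop[i][j] for j in range(2)) for i in range(2)
    )
    return mat_mul(mat_mul(t_out, mat_inv(i_minus_loop)), t_in)

# Hypothetical round-trip matrix: half of the amplitude returns on each pass.
half = ((0.5, 0.0), (0.0, 0.5))
closed = reduce_loop(IDENTITY, half, IDENTITY)
```

With this round-trip matrix the closed-loop gain is twice the identity, the sum of the series 1 + 1/2 + 1/4 + ... over successive round trips.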

5. Conclusion

We  have  developed  a  new  methodology  that  can  be  used  to  model  optical  architectures 
based  on  widely  used  components.  Extension  of  this  methodology  to  large  optical 
architectures  is  fairly  simple.  The  objective  is  to  provide  a  tool  to  the  experimenter  to  deal 
with  the  architecture  before  its  implementation.  Work  is  in  progress  for  the  validation  of  this 
methodology. 


a) The signal flow graph of the optical components before connections (nodes 2', 3, 4', 5, 6', 7; branch S21)

b) Signal flow graph after connections

c) Final signal flow graph after removal of irrelevant nodes and branches

Figure 4. Reduction of the graph of the optical architecture of figure 3.

Abstract. A direct 2's complement parallel array multiplication architecture is
proposed. The algorithm overcomes the problems encountered in the conventional
one. Using a two-stage array, complex multiplication is achievable. Correspondingly,
a hybrid optoelectronic system with free-space interconnection is suggested.

1.  Introduction 

The conventional 2's complement algorithm has the following defects: (1)
the operands must be encoded with the same number of bits as needed for their product, so
the range of the result must be known in advance; (2) the upper half of the product bits
is discarded, which is a severe waste of the space-bandwidth
product (SBWP); (3) the intermediate mixed 2's complement result cannot be
summed with weights. Although a modified version requires no sign-extension bits
and permits all processor channels to be used for the value encoding, its
postprocessing needs diverse sign decisions and irregular additions. To solve these
problems, we propose a direct 2's complement parallel array multiplication
algorithm.

2.  Modified  direct  twos  complement  parallel  array  multiplication  algorithm 

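In 2's complement representation, any analog number x encoded by an N-bit
string $(x_{N-1}, x_{N-2}, \ldots, x_1, x_0)$ takes the value

$$x = -x_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} x_i 2^i \qquad (1)$$

and the additive inverse of x equals

$$-x = -\bar{x}_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} \bar{x}_i 2^i + 1, \quad \bar{x}_i = 1 - x_i \ (i = 0, 1, \ldots, N-1). \qquad (2)$$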

With the mixed 2's complement representation, addition of two numbers is
performed by summing the corresponding digits of both strings without carries,
and subtraction is performed by summing the minuend and the additive inverse of
the subtrahend.

Based on this knowledge, the direct parallel array multiplication algorithm can
be derived. The product of two N-bit numbers u and v is given by

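$$p = u \times v = \Bigl(-u_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} u_i 2^i\Bigr)\Bigl(-v_{N-1} 2^{N-1} + \sum_{i=0}^{N-2} v_i 2^i\Bigr)$$

$$= u_{N-1} v_{N-1} 2^{2N-2} + \sum_{i=0}^{N-2}\sum_{k=0}^{N-2} u_i v_k 2^{i+k} - u_{N-1} 2^{N-1} \sum_{i=0}^{N-2} v_i 2^i - v_{N-1} 2^{N-1} \sum_{i=0}^{N-2} u_i 2^i. \qquad (3)$$

Since

$$-\sum_{i=0}^{N-2} v_i 2^i = -1 \cdot 2^{N-1} + \sum_{i=0}^{N-2} \bar{v}_i 2^i + 1 \qquad (4)$$

and

$$-\sum_{i=0}^{N-2} u_i 2^i = -1 \cdot 2^{N-1} + \sum_{i=0}^{N-2} \bar{u}_i 2^i + 1, \qquad (5)$$

there exists

$$p = (u_{N-1} v_{N-1} - u_{N-1} - v_{N-1}) 2^{2N-2} + \sum_{i=0}^{N-2}\sum_{k=0}^{N-2} u_i v_k 2^{i+k} + \sum_{i=0}^{N-2} (u_{N-1} \bar{v}_i + v_{N-1} \bar{u}_i) 2^{i+N-1} + (u_{N-1} + v_{N-1}) 2^{N-1}. \qquad (6)$$

The term with coefficient $(u_{N-1} v_{N-1} - u_{N-1} - v_{N-1})$ can be transformed to

$$(u_{N-1} v_{N-1} - u_{N-1} - v_{N-1}) 2^{2N-2} = (1 + \bar{u}_{N-1} \bar{v}_{N-1}) 2^{2N-2} - 1 \cdot 2^{2N-1}. \qquad (7)$$

Therefore the uniform additive formula is advanced:

$$p = 1 \cdot (-2^{2N-1}) + (1 + \bar{u}_{N-1} \bar{v}_{N-1}) 2^{2N-2} + \sum_{j=0}^{N-2}\sum_{k=0}^{N-2} u_j v_k 2^{j+k} + \sum_{i=0}^{N-2} (u_{N-1} \bar{v}_i + v_{N-1} \bar{u}_i) 2^{i+N-1} + (u_{N-1} + v_{N-1}) 2^{N-1}. \qquad (8)$$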

The above convolution can be realized in a parallel array architecture (Fig. 1).
The digits of the two numbers are spread orthogonally. All the bit products are
formed in parallel by AND-ing the inputs crossing at each cell. Meanwhile, all the
partial products along each diagonal are equally weighted and are added
separately without carries, resulting in a new 2N-digit-long mixed 2's
complement sequence. The advantages are as follows: (1) The weighted sum
constitutes the product directly. (2) We need not know the number of bits needed to
represent the product before the operation is conducted. (3) The present algorithm
requires only (N+2) channels, and all the bit products contribute to the result.
This greatly saves SBWP. In addition, it is also feasible to use just N
channels if the last term in Eq. (8) is supplied by postprocessing. (4) Since
each digit of the mixed product is weighted as in the 2's complement system, the
architecture is, in principle, cascadable.
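
The direct formula can be checked numerically. The sketch below evaluates the product from the bit-level terms of the uniform additive formula (a constant of minus 2^(2N-1), the sign-bit term at weight 2^(2N-2), the AND-ed bit products, the cross terms with complemented bits, and a final term at weight 2^(N-1)), and compares it with ordinary integer multiplication. The grouping of terms follows the derivation in this section as reconstructed here, so it should be read as an assumption rather than as the authors' exact layout; the function names are illustrative.

```python
def bits(x, n):
    """n-bit 2's complement digits of x, least significant first."""
    return [(x >> i) & 1 for i in range(n)]

def direct_product(u, v, n):
    """Product of two n-bit 2's complement numbers from bit-level terms only."""
    ub, vb = bits(u, n), bits(v, n)
    us, vs = ub[n - 1], vb[n - 1]                  # sign bits
    p = -(1 << (2 * n - 1))                        # constant term
    p += (1 + (1 - us) * (1 - vs)) << (2 * n - 2)  # complemented sign bits
    for j in range(n - 1):
        for k in range(n - 1):
            p += (ub[j] & vb[k]) << (j + k)        # AND-ed bit products
    for i in range(n - 1):
        # cross terms use the complemented lower bits of the other operand
        p += (us * (1 - vb[i]) + vs * (1 - ub[i])) << (i + n - 1)
    p += (us + vs) << (n - 1)                      # last term
    return p
```

For N = 4 the function agrees with `u * v` for all 256 operand pairs in the range -8 to 7, so no sign extension of the operands is needed.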


Figure  1  Figure  2 


3.  Two-stage  array  complex  multiplication 

Assume two complex operands $a = a_1 + j a_2$ and $b = b_1 + j b_2$, where $a_1$, $a_2$, $b_1$, $b_2$ are
real numbers, each encoded by N bits in 2's complement. We use
superscripts to symbolize the digit significance. In the product

$a \times b = (a_1 b_1 - a_2 b_2) + j (a_1 b_2 + a_2 b_1)$,

the multiplications $a_1 b_1$, $a_1 b_2$, $a_2 b_1$ are performed in the preceding way, and $(-a_2 b_2)$
can be realized by a different parallel bit product array, shown in Fig. 2, according
to the following equation, except for the terms $b_2^i 2^i$ (i = 0, 1, ..., N-2), which are left for
postprocessing:


N-2N-2 
i=0  k=0 


The complex product can be generated fully in parallel by a two-stage 2x2 array
(see Fig. 3). In the $(-a_2 b_2)$ subarray, the inputs related to $a_2$ are negated in
comparison with the left $a_2 b_1$ subarray.
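
At the level of integer values, negating the inputs related to one operand works because the additive inverse in 2's complement is the bitwise complement plus one. The sketch below illustrates this for the real part of a complex product. Treating the extra addend as the part "left for postprocessing" is an interpretation made for this example, and the function name is hypothetical.

```python
def complex_product(a1, a2, b1, b2):
    """Real and imaginary parts of (a1 + j a2)(b1 + j b2).

    The -a2*b2 term is formed from the bitwise complement of a2:
    -a2 = ~a2 + 1 in 2's complement, so -a2*b2 = (~a2)*b2 + b2.
    """
    negated_a2 = ~a2                  # every bit of a2 inverted
    subarray = negated_a2 * b2        # product formed with negated inputs
    real = a1 * b1 + subarray + b2    # + b2 is the correction added afterwards
    imag = a1 * b2 + a2 * b1
    return real, imag
```

For example, `complex_product(3, -2, -1, 4)` gives `(5, 14)`, matching (3 - 2j)(-1 + 4j) = 5 + 14j.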


4.  Optical  implementation 

The two-stage array architecture has been verified by experiment. The optical
construction, which comprises three parts, is illustrated in Fig. 4. The first
imaging part (P-Q) forms the four product subarrays. The second part, a multiple-imaging
system (Q-R), carries out addition of two of the relevant subarrays. The incoherent
correlator (R-S) integrates the diagonal elements to get both the real and imaginary
parts of the result. The system is interfaced with a PC for pre- and
post-processing.

Abstract: A new technique for high-speed recoded trinary signed-digit (TSD) arithmetic
using  optical  symbolic  substitution  is  proposed.  This  technique  performs  multi-digit 
carry-free  addition  and  borrow-free  subtraction  in  constant  time  using  only  39%  of  the 
minterms  required  in  the  most  recently  reported  TSD  arithmetic  technique. 


1  Introduction 

The problems of achieving massively parallel optical computing have been investigated by
many authors using the residue number system, the signed-digit number system, and recoded
signed-digit number systems [1-4]. Using the residue number system, one can perform parallel
arithmetic in constant time using symbolic substitution (SS) [5], but the size of the truth table
required increases rapidly with the operand length. The redundant signed-digit
number system [1] allows parallel arithmetic with fewer carry propagation steps. Higher radix
signed-digit number systems allow higher information storage density, less complexity, fewer
system components, and fewer cascaded gates and operations. Among the higher radix number
systems, the trinary signed-digit number system appears to be the most promising in terms of
storage density and processing elements [6].

Recently, a two-step symbolic substitution (SS) technique for TSD arithmetic has been
reported [6]. This technique performs carry-free addition and borrow-free subtraction by
checking a pair of reference digits from the next lower order digit position and requires 58
four-variable minterms for each output digit. Later, a higher-order TSD SS technique was
designed [4] which enhances the computation speed at the expense of increasing the number
of six-variable minterms per output digit. A simpler carry-free addition scheme [7], in which
the binary signed-digit numbers are recoded before performing the addition, has recently been
proposed. Awwal [8] implemented the recoded binary signed-digit arithmetic technique using
an opto-electronic implementation which requires sixteen four-variable minterms per output
digit. Thus, in any SS scheme, the most important objective is to minimize the number of
minterms (substitution rules) and computational steps while incorporating the optimum
information in fewer digits. To achieve these objectives without sacrificing the processing speed
and parallelism of optics, we propose herein the recoded TSD system for high-speed multi-digit
carry-free addition and borrow-free subtraction.

2  Recoded  TSD  Arithmetic 

A TSD literal may be represented as a member of the set {$\bar{2}$, $\bar{1}$, 0, 1, 2}, with $\bar{2}$ and $\bar{1}$ representing
-2 and -1 respectively. When two TSD numbers are added, a carry will be generated for the
digit combinations $\bar{2}\bar{2}$, $\bar{2}\bar{1}$, $\bar{1}\bar{2}$, 12, 21, and 22. For carry-free addition, the aforementioned
digit combinations must be eliminated from the augend and addend operands. By exploiting
the redundancy of the TSD numbers, it is possible to eliminate the occurrence of the
above-mentioned digit combinations through a recoding truth-table as shown in Table I. An n-digit
TSD number $P = P_n P_{n-1} P_{n-2} \cdots P_1 P_0$, when recoded using Table I, results in an
(n+1)-digit recoded TSD number $Q = Q_{n+1} Q_n Q_{n-1} Q_{n-2} \cdots Q_1 Q_0$ such that P and Q
are numerically equal and the digits of Q satisfy the condition $Q_i \times Q_{i-1} \notin \{2, 4\}$,
$0 < i \le (n+1)$. In Table I, four TSDs are investigated to output a recoded TSD. With four
TSDs, it is possible to generate a number over the range of $-80_{10}$ $(\bar{2}\bar{2}\bar{2}\bar{2})_3$ to $+80_{10}$ $(2222)_3$.
Owing to space limitations, Table I shows only those input combinations within the
range -80 to -1 (referred to as the negative category) which yield non-zero outputs. The
remaining entries representing the range +1 to +80 (referred to as the positive category) may
be  obtained  by  complementing  the  corresponding  negative  category  entry.  For  illustration, 
if  the  first  entry  (2  21  23  =  -75io)  of  Table  I  is  complemented,  we  obtain  the  corresponding 
digits  (22T23  =  75io)  of  the  positive  category  entry.  Thus,  in  Table  I,  the  input  minterms 
and their corresponding output digits have a complementary relationship. Also, the output
of Table I does not include the 2 and $\bar{2}$ literals. Therefore, the recoding operation maps the
TSD set {$\bar{2}$, $\bar{1}$, 0, 1, 2} into a smaller set {$\bar{1}$, 0, 1}. The addition truth-table corresponding to
the recoded TSD output of Table I is shown in Table II, where $A_i$, $B_i$ and $S_i$ represent the
addend, augend and sum operands, respectively. Table II represents a simplified TSD addition
truth-table since it does not incorporate any minterm involving the 2 and $\bar{2}$ literals. Note that
it is necessary to pad three zeros trailing the least significant digit and one zero preceding
the most significant digit in order to apply Table I to recode a TSD number. To show the
application of Table I and Table II, consider the following example for TSD addition:


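Operand type | TSD representation | Recoded TSD representation | Decimal representation
Addend | $(\phi\,2\,2\,1\,\bar{2}\,\bar{2}\,\bar{1}\,1\,\phi\,\phi\,\phi)_3$ | $(1\,0\,\bar{1}\,0\,0\,1\,\bar{1}\,1)_3$ | $(1951)_{10}$
Augend | $(\phi\,\bar{1}\,1\,2\,1\,\bar{2}\,\bar{2}\,2\,\phi\,\phi\,\phi)_3$ | $(0\,0\,\bar{1}\,\bar{1}\,0\,1\,\bar{1}\,\bar{1})_3$ | $(-319)_{10}$
Final sum | | $(1\,0\,\bar{2}\,\bar{1}\,0\,2\,\bar{2}\,0)_3$ | $(1632)_{10}$
Final carry | | $(0\,0\,0\,0\,0\,0\,0\,0)_3$ | $(0)_{10}$

where $\phi$ indicates a padded zero. The above results clearly illustrate that the addition of the
recoded TSD numbers is carry-free.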
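
The arithmetic of the worked example can be checked with a few lines of code. In the sketch below the signs of the recoded digits are an assumption: the overbars are not legible in this copy, so the signs were chosen as the only assignment that reproduces the stated decimal values 1951 and -319. The recoding step itself (Table I) is not modelled; only the digit-wise addition of already recoded operands is shown.

```python
def tsd_value(digits):
    """Value of a radix-3 signed-digit string, most significant digit first."""
    value = 0
    for d in digits:
        value = 3 * value + d
    return value

def add_recoded(a, b):
    """Digit-wise sum of two equal-length recoded strings (digits in {-1, 0, 1})."""
    if len(a) != len(b):
        raise ValueError("operands must have the same number of digits")
    if any(d not in (-1, 0, 1) for d in a + b):
        raise ValueError("recoded digits must be -1, 0 or 1")
    # No carry is needed: each digit sum already lies in the TSD set -2..2.
    return [x + y for x, y in zip(a, b)]

addend = [1, 0, -1, 0, 0, 1, -1, 1]     # assumed signs, value 1951
augend = [0, 0, -1, -1, 0, 1, -1, -1]   # assumed signs, value -319
total = add_recoded(addend, augend)
```

Because every recoded digit lies in {-1, 0, 1}, each position of `total` is a valid TSD digit on its own and no position depends on its neighbour, which is the carry-free property the example illustrates.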


3  Optical  Implementation 

Among  the  various  SS  architectures  proposed  for  optical  implementation  of  the  binary  and 
signed-digit  arithmetic  [2-4,  6,  9-12],  the  direct  truth-table  content-addressable  memory  (CAM) 
based  approach  that  precalculates  and  stores  the  processing  results  [11]  appears  to  be  the  most 
promising  in  terms  of  the  computation  speed.  To  design  a  system  with  the  optimum  memory 
requirement but relatively high speed, either holographic or nonholographic CAM-based optical
implementation, as detailed in References 10 and 12, may be used. The first step in designing
a  CAM  is  to  minimize  the  truth  table  and  obtain  the  minterms  corresponding  to  the  non-zero 
entries  of  the  desired  output.  The  reduced  minterm  sets_of  the  recoding  truth  table  are  Qi  e 
{d,  2  22dYoi2»  dj2ll^^Toi2’  021dYoi2’  ^  1  «^ioT2’  1  ^10121^12  ^  01 2^ 

dj2  22dioY2’  ‘^12  1^012  4^  dj  2  Ido  12  02doi2d,  d^^212,  d^^l22,  d^^l22, 

d^^212,  dj^222,  ^12  112,  di2  0dd,  02  22,  01  12}  and  G  {dY2  2  2^^012’  dY2lldioY2, 
0T2(i.oi2»  021djoY2,  dj221dYoi25  d^2i:ldYoi2’  di2l2dYj)i2’  d^i  2  2  ^yoi  2?_di2  1  doY2  d, 
02doY2d,  dY2ldoi2d,  d^22doi2d,  dY2212,  dY2l22,  dY2l22,  dY2212,  ^^^222,  dj2ll2, 
d-  0  dd,  0  22  2,  0  1  1  2},  respectively. Thus, 42 four-variable ($P_i P_{i-1} P_{i-2} P_{i-3}$) minterms are
required to generate the 1 and $\bar{1}$ outputs of $Q_i$. Notice that the minterms for yielding the 1


Table  I.  Recoding  Truth-Table 
for  TSD  Numbers. 


Table  II.  Addition  Truth-Table 
for  Recoded  TSD  Numbers. 


Fig.  1.  Optical  implementation  for  the  recoded  trinary 
signed-digit  arithmetic  using  CAM. 
and $\bar{1}$ outputs of $Q_i$ are exact complements of each other. For illustration, when the reduced
minterm  generating  the  1  output  oi_Qi  is  complemented,  we  obtain  the  corre¬ 

sponding  minterm  ^^3  2  2^^012  generating  the  1  output_of  Qi.  For  the  second  processing 
step using Table II, only two minterms 01 and 10 ($0\bar{1}$ and $\bar{1}0$) are required to generate the 1
($\bar{1}$) output, and only one minterm 11 ($\bar{1}\bar{1}$) is required to generate the 2 ($\bar{2}$) output of $S_i$. Also,
the minterms for $S_i$ = 1 (2) are digit-by-digit complements of the corresponding minterms for
$S_i = \bar{1}$ ($\bar{2}$). Thus, 21 minterms are required to generate the 1 output of $Q_i$, and 3 minterms
are required to generate the 1 and 2 outputs of $S_i$. In the quad-rail coding scheme, a TSD $\bar{t}_i$ is
the 180° rotated version of $t_i$. Accordingly, the minterms corresponding to the $\bar{1}$ output of $Q_i$, and
the $\bar{1}$ and $\bar{2}$ outputs of $S_i$, can be generated by using a slightly different geometric configuration
employing mirrors. Therefore, the whole system can be implemented with 21 four-variable
minterms for recoding and 3 two-variable minterms for generating each final output digit.
It may be mentioned that a two-variable minterm occupies only half of the encoding space
occupied by a four-variable minterm. The most recently reported two-stage implementation
for TSD addition employed 58 four-variable minterms [6]. Thus, the recoded TSD technique
requires only 39% of the minterms used in Reference 6 for TSD arithmetic.
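
One reading of the count that reproduces the quoted figure, assuming each two-variable minterm is weighted as half a four-variable minterm (as the remark on encoding space suggests), is:

```latex
\frac{21 + 3 \times \tfrac{1}{2}}{58} = \frac{22.5}{58} \approx 0.388 \approx 39\%
```

Counting the three two-variable minterms at full weight would instead give 24/58, or about 41%.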

The proposed technique can be implemented using a holographic or nonholographic CAM-based
SS system [10, 12]. A holographic CAM requires coherent light and suffers from
inconvenient recording problems. On the other hand, a nonholographic CAM can be
implemented with coherent or non-coherent optical processing. Because of its ease of alignment and
implementation in comparison to a holographic CAM, a nonholographic CAM approach may
be used for the proposed technique. The optical implementation for the proposed technique is
shown in Fig. 1, where the inputs are recoded electronically and the addition step is performed
optically to generate the final output. Note that only the minimized minterms for the 1 and
2 outputs of the addition step are stored in the nonholographic CAM, while the minterms for
the $\bar{1}$ and $\bar{2}$ outputs are generated by the beam splitter and mirror (m2) combination.

4  Conclusion 

We have proposed an efficient trinary signed-digit arithmetic technique for performing parallel
carry-free addition and borrow-free subtraction in constant time, in two steps, independent
of the digit string length. This technique requires the fewest minterms among
the previously reported trinary signed-digit arithmetic techniques [4, 6]. Finally,
a CAM-based optical implementation is suggested for the proposed scheme.

Abstract

Two digital optical computing architectures are being developed for
compute-intensive applications. A 32-bit digital optical processor has been implemented
in hardware. The machine is called DOC II (digital optical computer). The
platform implements the $N^2$ (parallel) architecture at an interconnect density of
8192 and a clock rate of 100 MHz, resulting in a gate-interconnect-bandwidth
product of $0.8192 \times 10^{12}$. Development efforts are underway to implement the
$N^4$ (global) smart pixel architecture within miniature high performance
optoelectronic  computing  (HPOC)  modules.  Objectives  include  the  fabrication 
of  10^  interconnect  modules  at  clock  rates  up  to  1  GHz. 


1.0 $N^2$ parallel architecture

The architecture for DOC II is referred to as the $N^2$ architecture. It is a Boolean
vector/matrix multiplication unit as shown in Figure 1. Equation (1.) relates the Boolean
summation of an I-element vector, $x_i$ (I -> N), against a control microcode matrix which
contains on the order of $N^2$ terms, I*J.


=n^  (1.) 

'^1—1  -^1=1  ^  T7-ir\f'lT>l  RinctioruJf 

Figure 1: Optical Boolean vector/matrix architecture.

In actuality i goes to I, which is on the order of N, but j is arbitrary, depending on the
number of detectors and/or the size or availability of the control operator hardware. Under
these conditions the gate interconnect bandwidth product for this architecture is
B*i*j = γ*B*$N^2$, where γ is a proportionality constant depending on J (the limit of j, which
equals a in Figure 1) and B is the bit rate. Since the summation on the detector is strictly
Boolean, i.e. a threshold is performed strictly between the zero light state and a FAN-IN
of 1, then by De Morgan's theorem the output is represented by the product/multiplication
as indicated by equation (1.). Consequently, a Boolean minterms or functionals on a given
data input vector $x_i$ may be formed on the detector array at each clock cycle. A second
pass through the system allows for the summation of the functionals to form partial or
complete instructions depending on the complexity of the instruction.
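
The role of De Morgan's theorem can be illustrated in software. In the sketch below a detector reports whether any unmasked beam is lit (a Boolean sum thresholded between no light and a fan-in of one), and the complement of that output is a Boolean product. The dual-rail layout, the mask convention and the function names are assumptions made for the example, not a description of the DOC II microcode format.

```python
def detector_or(x, mask_column):
    """Thresholded detector: 1 if any unmasked input beam is lit (Boolean sum)."""
    return int(any(xi and mi for xi, mi in zip(x, mask_column)))

def functional(x, mask_column):
    """Complement of the detector output; by De Morgan this is a product term."""
    return 1 - detector_or(x, mask_column)

def dual_rail(bits):
    """Encode each bit b as the pair (b, not b)."""
    rails = []
    for b in bits:
        rails += [b, 1 - b]
    return rails

# Mask selecting the complement rail of x0 and the true rail of x1:
# NOT(NOT x0 OR x1) = x0 AND NOT x1, i.e. one minterm of two variables.
mask_x0_and_not_x1 = [0, 1, 1, 0]
```

With dual-rail inputs, every minterm of the input bits can be obtained this way by choosing which rail of each bit the mask column passes.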


The hardware consists of two primary assemblies (Figures 2 and 3): the illumination
assembly (Train A) and the modulation relay assembly (Train B). Both trains, which are
optimized for maximum throughput at 837 nm, are positioned on a 36" x 48" optical table
which is packaged in an optoelectronic cabinet. During each clock cycle the hardware
provides up to 128 functionals on a given input vector of up to 32 bits (dual rail) in length.
This is provided by 64 independently modulated input edge-emitting laser diodes arranged in
groups of eight elements. The control mask is clocked into the spatial light modulator
(64 x 128 input control data matrix). The spatial light modulator consists of a 64-channel
GaP acousto-optic Bragg cell. Each channel is centered at 800 MHz with a bandwidth of
400 MHz and a time aperture of 2.56 µs. Only an aperture of 1.28 µs is utilized. The output
consists of a 128-element linear avalanche photodiode array followed by a transimpedance
receiver and ECL thresholds. For further reading on this hardware the reader is referred to
references 2 and 3. Thus, the processor can achieve an optical data carrier fan-out of 1:128
and a control logic fan-in of 64:1.

Figure 2: DOC II layout (labels: beam multiplexing assembly, first anamorphic relay, spatial light modulator, second anamorphic relay, APD array, optical table).

Figure 3: Photograph of DOC II layout.

Current compute-intensive applications, including RISC emulation, full text data searches,
and multimedia database development, have been demonstrated.


2.0 $N^4$ global architecture

The architecture for DOC III is referred to as the $N^4$ architecture. It is a Boolean
matrix/tensor multiplication unit as shown in Figure 4. Equation (2.) relates the Boolean
summation of a matrix, $x_{ij}$, which contains on the order of $N^2$ terms, I*J (i -> I, j -> J),
against a control microcode tensor which contains, in optical memory, on the order of
$N^4$ terms, I*J*K*L.


=  i  i 

i  =  1 j  =  1 


=  fi  ri 

i=l  j=l 


*ij  ^ij.U 

Figure 4: Single stage HPOC module. Input VCSEL array is followed by a 2D array
of diffractive optical elements which form $N^4$ interconnects to subsequent receiver
amplifier arrays and VCSEL re-emitters.


(2.) 




In actuality i goes to I, which is on the order of N, but k is arbitrary, depending on
the number of detectors and/or the complexity or availability of the control operator
hardware. In addition, j goes to J, which is on the order of N, but l is arbitrary, depending
on the number of detectors and/or the complexity of the control operator diffractive
optical element. If K and L are also on the order of N, then the gate interconnect bandwidth
product for this architecture is B*I*J*K*L = γ*B*N^4, where γ is a proportionality constant
depending on I, J, K, and L, and B is the bit rate. Since the summation on the detector is
again strictly Boolean, i.e. a threshold is performed strictly between the zero light state
and a FAN-IN of 1, then by DeMorgan's theorem, the output is represented by the
matrix/tensor product as indicated by equation (2.). Consequently K*L Boolean minterms or
functionals on a given data input matrix x_ij may be formed on the detector array at each
clock cycle.
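
The DeMorgan argument above can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, that the control tensor acts as a Boolean mask selecting which input beams reach detector (k, l), and that the inputs are presented in complemented form; the function names and the mask interpretation are hypothetical and are not taken from the DOC III hardware description.

```python
import numpy as np

def detector_is_dark(beams, mask):
    """Threshold between zero light and a fan-in of 1.

    beams: (I, J) Boolean array of optical inputs.
    mask:  (I, J, K, L) Boolean array, True where input (i, j) is routed
           to detector (k, l) (assumed meaning of the control tensor).
    Returns a (K, L) Boolean array that is True where no light arrives.
    """
    light = np.einsum("ij,ijkl->kl", beams.astype(int), mask.astype(int))
    return light < 1

def and_via_de_morgan(x, mask):
    """AND of the selected inputs, computed as the NOR of their complements."""
    return detector_is_dark(~x, mask)

# Example: a 2 x 2 data matrix and two detectors (K = 1, L = 2).
x = np.array([[True, False],
              [True, True]])
mask = np.zeros((2, 2, 1, 2), dtype=bool)
mask[0, 0, 0, 0] = mask[1, 0, 0, 0] = True   # detector 0: x00 AND x10
mask[0, 0, 0, 1] = mask[0, 1, 0, 1] = True   # detector 1: x00 AND x01
minterms = and_via_de_morgan(x, mask)
```

Each detector only distinguishes "no light" from "some light", yet the negated result equals the logical product of the selected inputs, which is the sense in which K*L minterms are formed per clock cycle.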

The  cascade  of  a  second  HPOC  module,  as  shown  in  Figure  5,  allows  for  the 
summation  of  the  functionals  to  form  partial  or  complete  instructions  depending  on  the 
complexity  of  the  instruction. 



The second HPOC module provides for the double summation of the functionals
computed from the first HPOC module. The output of the double stage cascade, y_mn,
represents complete or partial instructions computed depending on the length of the
instruction encoded into the diffractive optical elements within each HPOC. As can be seen
in equation (3.) the double stage cascade produces a four dimensional Boolean matrix/
tensor multiply followed by a Boolean matrix/tensor addition.


$$ y_{mn} \;=\; \sum_{k=1}^{K}\sum_{l=1}^{L}\left[\,\prod_{i=1}^{I}\prod_{j=1}^{J} x_{ij}\,c_{ij,kl}\right] c_{kl,mn} \qquad (3.) $$


This  is  required  to  fulfill  the  sum-of-products  formulation  as  required  by  Shannon's 
generalized  digital  computation  theory. 
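
As a hedged illustration of the sum-of-products idea, the sketch below cascades an AND stage and an OR stage to compute XOR from dual-rail inputs. The mask tensors `c1` and `c2`, the dual-rail layout of `x`, and all function names are assumptions made for this example; they are not the encoding used in the HPOC modules.

```python
import numpy as np

def and_stage(x, c1):
    """First stage: minterm (k, l) is true when no selected input is false."""
    blocked = np.einsum("ij,ijkl->kl", (~x).astype(int), c1.astype(int))
    return blocked == 0

def or_stage(f, c2):
    """Second stage: output (m, n) is true when any selected minterm is true."""
    return np.einsum("kl,klmn->mn", f.astype(int), c2.astype(int)) >= 1

def xor_via_cascade(a, b):
    # Dual-rail data matrix: row 0 holds a, b; row 1 holds their complements.
    x = np.array([[a, b], [not a, not b]], dtype=bool)
    c1 = np.zeros((2, 2, 1, 2), dtype=bool)
    c1[0, 0, 0, 0] = c1[1, 1, 0, 0] = True   # minterm 0: a AND (NOT b)
    c1[1, 0, 0, 1] = c1[0, 1, 0, 1] = True   # minterm 1: (NOT a) AND b
    c2 = np.ones((1, 2, 1, 1), dtype=bool)   # sum both minterms into one output
    return bool(or_stage(and_stage(x, c1), c2)[0, 0])

truth_table = [xor_via_cascade(a, b) for a in (False, True) for b in (False, True)]
```

Any Boolean function of the inputs can be written in this two-level form, which is why a second stage is needed in addition to the minterm-forming first stage.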
The hardware being developed for HPOC module implementation is shown in Figures 5 and 6.
Each module is being developed as a hybrid structure consisting of five elements. The
transmitter arrays consist of vertical cavity surface emitting laser diodes (VCSELs).
Initially these GaAs arrays are electrically connected at the edges through electrical
FAN-IN from the receiver arrays via wire bonds. The laser arrays are epoxied to a ceramic
substrate. The receiver arrays are designed in GaAs with metal-semiconductor-metal (MSM)
detectors followed by an E/D MESFET transimpedance amplifier and laser driver stage. The
diffractive optical elements are mounted above each of the GaAs structures and physically
bonded to beam folding optics which allow for semi-planar implementation as shown in
Figure 6. Here a 3x3 array of four-stage cascade HPOC modules is mounted on a board.

Each four-stage HPOC island consists of two arrays of VCSELs and two arrays of detection,
amplification, negation and driver stages. More complete descriptions of this hardware
may be found in references [5-7].
Figure 5: Single stage HPOC module hardware configuration. Beam
folding optics provide semi-planar implementation of hybrid GaAs
transmitter/receiver structures coupled by diffractive optical interconnect
elements to provide N^4 free space optical interconnects.
Abstract: A massively parallel processing system with a reconfigurable shift-invariant optical
interconnection among electronic general purpose processing elements (PEs) is described. Each
PE is so compact that more than 4,000 PEs can be integrated into one chip for direct
coupling with array type optical devices in parallel. The optical interconnection is constructed
by using a surface emitting laser diode array and a phase modulation type spatial light modulator
on which optimized computer generated holograms are written. In this paper, the design concept
of the system and the PE, the configuration of an experimental system, and an algorithm for the
parallel optoelectronic computing system with some applications are shown.


1.  Introduction 

Integrated  optical  devices  such  as  surface  emitting  laser  diode  arrays,  photo  detector 
arrays  and  OEIC’s  promise  high  performance  parallel  processing.  Although  these  devices 
have  two-dimensional  parallelism  for  pattern  information  processing,  their  potential 
capabilities  are  not  utilized  in  conventional  parallel  processing  systems.  The  "I/O 
bottleneck"  between  processing  element  (PE)  array  and  I/O  devices  sets  the  limit  to  the 
processing  speed. 

In order to overcome the I/O bottleneck, two-dimensional interconnections between
them are required. In other words, it is expected that high performance parallel computing
systems can be realized by using such two-dimensional optical devices and fine grain
electronic PE arrays. Although processing architectures for utilizing such devices are
intrinsically based on massively parallel processing with parallel optical input and output,
the interconnects cannot be implemented by using conventional macro-scale wiring
technology. Only integrated wiring technologies such as VLSI, flip chip bonding, and
GaAs on silicon can implement the interconnection. This means that the PE should be
compact enough to be integrated with the optical devices. Integrated PE and optical
device arrays realize fully parallel processing with high computing performance for
application fields such as high speed visual image processing, robot vision, automated
visual inspection, and pattern recognition.

In  this  paper,  the  design  concept  of  the  system,  configuration  of  an  experimental 
system,  and  algorithms  for  the  parallel  optoelectronic  computing  system  are  shown. 
2. Parallel optoelectronic processing systems

Conventional microprocessors cannot be used as the PE of the system, because their
number of gates is so large that many PEs cannot be implemented in a small area on a VLSI
chip. The keys to the design are how to keep the generality of processing with a small
number of gates and how to implement the interconnection.

Based on this design concept, Ishikawa et al. have proposed a fully parallel hierarchical
processing architecture and a compact PE architecture for general purpose processing [1]. A
conceptual diagram of the architecture is shown in Fig. 1. Each layer in this architecture
consists of the same module with the two-dimensional PE array and optical I/O devices and is
optically interconnected with neighbor layers. Since the processing of each layer is
independent of the processing of other layers, different programs can be carried out
at each layer.

Since the architecture is difficult to implement as an experimental system because it
requires many devices, an equivalent architecture based on a time sharing algorithm, which
carries out the processing of each layer in time-sequential order, has been proposed. A
conceptual diagram of the feedback type architecture is shown in Fig. 2. Although the PE has
electrical interconnects with the four nearest neighbours, reconfigurable optical
interconnects are used for realizing a wide range of interconnections in order to maintain
the generality of interconnection.


Fig.  3  Block  diagram  of  processing  element 


2.1 Processing Element (PE)

In order to implement the optoelectronic parallel processing system shown in
Fig. 2, Ishikawa et al. designed a compact and general purpose PE. The block diagram
of the PE is shown in Fig. 3 [1].

The architecture of the PE has the following features to keep it compact: 1) direct
connection between PE and optical I/O, 2) SIMD type parallel processing, 3) control by
micro-instruction, 4) bit serial processing, 5) restricted electrical interconnects
(4 neighbours).

Each PE has three 8-bit registers (A: Accumulator, T: Template, W: Weight), one
arithmetic logical unit (ALU, 1 bit), and one 4-bit multiplier as shown in Fig. 3. The
ALU is based on bit serial processing, which is slow in comparison with bit parallel
processing but does not require as many gates, so it has major advantages for
integration and variable bit length processing. Functions of the ALU include
AND, OR, Exclusive OR, addition, subtraction, multiplication (4 bits × 4 bits) and
combinations of these basic functions such as a weighted sum for calculation of correlation.
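
To make the bit serial trade-off concrete, the following minimal sketch adds two register values one bit per cycle using a single 1-bit full adder. It is an illustration of the general technique only; the function name and the register handling are assumptions, not the PE's actual micro-instruction sequence.

```python
def bit_serial_add(a, b, width=8):
    """Add two unsigned integers one bit per cycle with a 1-bit full adder.

    Returns (sum modulo 2**width, final carry). The loop runs `width` times,
    so the same 1-bit hardware handles any word length at the cost of time.
    """
    carry = 0
    result = 0
    for i in range(width):
        a_bit = (a >> i) & 1
        b_bit = (b >> i) & 1
        sum_bit = a_bit ^ b_bit ^ carry
        carry = (a_bit & b_bit) | (carry & (a_bit ^ b_bit))
        result |= sum_bit << i
    return result, carry

total, carry_out = bit_serial_add(100, 55)      # fits in 8 bits
wrapped, overflow = bit_serial_add(200, 100)    # exceeds 8 bits
```

Changing `width` changes only the number of cycles, which is the "variable bit length processing" advantage mentioned above.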

The most important specification of the PE is the number of gates. As a result of
this design, the PE is implemented by using 337 gates, which is quite small. The
processing cycle time is typically 44 ns and at most 87 ns.

2.2 Scale up model: SPE-4k

The PE has been designed to be so compact that more than 64 × 64 = 4096 PEs
may be integrated into one chip by using present VLSI technology. An experimental
scale up model with 64 × 64 = 4096 PEs has also been developed. The system is named
SPE-4k (Sensory Processing Elements - 4k). The system is regarded as a scale up model of
the one chip optoelectronic processing layer shown in Fig. 1 and Fig. 2. The overview of
SPE-4k is shown in Fig. 4. The PEs, which are implemented by using gate arrays, are
arranged between the front LED array and the back PTR array.

Fig. 4 SPE-4k

Considering 8-bit integer addition as the basic operation for evaluating the speed of
SPE-4k, a total of 32 MOPS (Mega Operations Per Second) is obtained with the present
system and 3.2 GOPS (Giga Operations Per Second) at the maximum speed of the system.
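
As a rough check on these figures, the short calculation below converts the aggregate rates into per-PE rates. It assumes, which the text does not state explicitly, that all 64 × 64 PEs contribute equally and operate in lockstep; the function name is illustrative.

```python
def per_pe_rate(total_ops_per_second, n_pe=64 * 64):
    """Operations per second delivered by each PE, assuming equal shares."""
    return total_ops_per_second / n_pe

present_rate = per_pe_rate(32e6)     # 8-bit additions per second per PE, present system
maximum_rate = per_pe_rate(3.2e9)    # same, at the maximum speed of the system

# Time taken by one 8-bit addition in each case, in microseconds.
present_time_us = 1e6 / present_rate
maximum_time_us = 1e6 / maximum_rate
```

Under this assumption each PE completes one 8-bit addition in about 128 µs in the present system and about 1.28 µs at maximum speed, so the aggregate throughput comes from the degree of parallelism rather than from fast individual PEs.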

2.3 Reconfigurable optical interconnection

The PE has electronic interconnects with its four neighbors, but other interconnects
require iterative operations, which take a long processing time when only electronic
interconnects are used. In order to connect arbitrary PEs directly, Kirk et al. have
designed a reconfigurable optical interconnection [2] and showed basic experimental results
[3] using a phase modulation type spatial light modulator, PAL-SLM (Parallel Aligned
Spatial Light Modulator), developed by Hamamatsu K.K. [4].

The interconnection subsystem is based on a computer generated hologram (CGH)
optimized by using a simulated annealing algorithm. It realizes shift invariant interconnection
for SIMD type parallel processing. Reconfigurability of the interconnection is most
important for realizing generality of processing. It is implemented by rewriting the CGH on
the PAL-SLM by using a liquid crystal display controlled by a personal computer.
Consequently, practical functions of interconnection are obtained and demonstrated.

Recently, a surface emitting laser diode (SELD) array for integration of the optical
interconnection has been examined in the system, and the computing capabilities of the
whole system for parallel processing, including usage of the reconfigurable interconnection,
have been demonstrated. The block diagram of the system is shown in Fig. 5. The SELD array
used in the system, which was developed by NEC Corp. [5], is based on VC-VSTEP technology
and has 7 × 8 LDs (wavelength: 975 nm, output power: 1 mW, pitch: 125 µm) in a chip. In
order to match the SELD array, the PAL-SLM used in the system was developed for that
wavelength. The PD array is a 16 × 16 PIN photodiode array which has parallel amplified
output. The sensitivity of the PD array at that wavelength is about 0.15 A/W. By using these
optoelectronic devices, fully parallel operation without scanning has been implemented. A
photograph of the system is shown in Fig. 6. In the system, 16 × 16 = 256 PEs are implemented
as parallel PEs in the SPE. Part of the optical system is shown in Fig. 7. The PAL-SLM
modulates the light from the SELD array.

Fig. 5 Parallel processing system with reconfigurable optical interconnection
(components labelled in the diagram: SELD array (NEC 7x8), reduction optics, LCD,
PD array (C4675), SPE)

SELD: surface emitting laser diode
CGH: computer generated hologram
SLM: spatial light modulator
LCD: liquid crystal display
BS: beam splitter
PD: photo detector
PE: processing element
SPE: sensory processing elements

Fig. 6 Experimental system

A liquid crystal computer display (640 × 400) for OHP is used as the display for the
CGH. It can be controlled pixel-wise by a computer. An example of an interconnection
pattern is shown in Fig. 8. This pattern is one-to-four interconnects from four LDs. One
column, which includes four spots, is interconnected with the same LD.

Fig. 7 Optical subsystem

Fig. 8 Example of interconnection

Fig. 9 Algorithm of matrix-vector products and the interconnection patterns

2.4 Parallel processing algorithm

In order to evaluate the processing performance of the system, parallel computation of
matrix-vector products has been carried out on the system as an example. For the case in
which the data have already been loaded to the PEs, a flowchart of the calculation is shown
in Fig. 9. The calculation requires the three interconnection patterns shown in Fig. 9. In
the present system, changing the interconnection pattern is not fast because of the speed
of the LCD. Apart from this point, the speed is limited by the speed of the PD array (about
19 kHz) because of its high gain amplifiers. The number of processing steps is reduced to
half of that required without the optical interconnection.
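
The sketch below illustrates, under stated assumptions, how shift-invariant interconnects can shorten a matrix-vector product on a PE array. It is not the algorithm of Fig. 9, whose three interconnection patterns are not reproduced here: the doubling shift schedule, the use of `numpy.roll` to stand in for an optical shift pattern, and the function name are all hypothetical.

```python
import numpy as np

def matvec_on_pe_array(A, x):
    """Matrix-vector product on an n x n PE array (n a power of two).

    PE (i, j) holds A[i, j] and receives x[j]; every PE multiplies locally,
    then row sums are formed by shift-and-add steps. Each shift applies the
    same displacement to all PEs, i.e. it is shift invariant.
    Returns (A @ x, number of shift steps used).
    """
    n = A.shape[0]
    assert A.shape == (n, n) and n & (n - 1) == 0, "n must be a power of two"
    acc = A * np.tile(x, (n, 1))          # local multiply in every PE
    shift, steps = 1, 0
    while shift < n:
        acc = acc + np.roll(acc, -shift, axis=1)   # long-range shift pattern
        shift *= 2
        steps += 1
    return acc[:, 0], steps

A = np.arange(16).reshape(4, 4)
x = np.array([1, 2, 3, 4])
y, shift_steps = matvec_on_pe_array(A, x)
```

With nearest-neighbour links only, accumulating a row of n products takes n - 1 shift steps; with arbitrary shifts the same sum takes log2(n) steps in this sketch. The actual saving reported for the experimental system is the factor of two stated in the text.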


3.  Conclusion 

Parallel processing architecture for realizing an integrated optoelectronic system and
some experimental results are shown. Reconfigurable diffractive interconnects using a
surface emitting laser diode array and a phase only modulation type spatial light modulator
are implemented and demonstrated. The system has general purpose computing capabilities
using the compact programmable processing elements and the reconfigurable
interconnection. The design concept of the system can lead to an integrated optoelectronic
parallel computing system that is a real smart pixel system.

The author thanks A. Kirk of Vrije Universiteit Brussel for his contribution when he was a research
fellow at University of Tokyo, and also thanks T. Tabata, T. Ishida, M. Naruse, H. Yamamoto, Y. Nakabo, and
N. Terada of University of Tokyo for their assistance. As for the devices in the system, the author thanks
K. Kasahara and his research group of NEC Corporation for their kind support in supplying the SELD array, and
thanks Y. Suzuki, T. Hara, and their group of Hamamatsu Photonics K.K. for their assistance in setting the
PAL-SLM.

References 

[1] Ishikawa, M., Morita, A. and Takayanagi, N., 1993, Optical Computing Technical Digest 1993
(Optical Society of America, Washington, D.C.), 7, 272-273.

[2] Kirk, A., Tabata, T., Ishikawa, M. and Toyoda, H., 1994, Opt. Comm., 105, 302-308.

[3] Kirk, A., Tabata, T. and Ishikawa, M., 1994, Appl. Opt., 33, 1629-1639.

[4] Yoshida, N., et al., 1993, Spatial Light Modulator and Applications Technical Digest 1993 (Optical
Society of America, Washington, D.C.), 6, 97-99.

[5] Kajita, M., et al., 1994, Jpn. J. Appl. Phys., 33, 859-863.


Inst. Phys. Conf. Ser. No 139: Part I

Paper presented at Opt. Comput. Int. Conf., Edinburgh, 22-25 August 1994
© 1995 IOP Publishing Ltd


Free  Space  Holographically  Interconnected  Counter 


R.J. Feuerstein, D.C. O'Brien, A. Fedor, C.C. Mao, L.H. Ji

Optoelectronic  Computing  Systems  Center,  University  of  Colorado  Boulder, 
CO  80309-0525  Phone:  303-492-7077,  303-492-3674  (FAX) 


Abstract.  We  have  constructed  a  simple  4-bit  counter  using  holographic 
interconnects,  microlens  arrays,  linear  arrays  of  vertical  cavity  surface  emit¬ 
ting  lasers  (VCSELs),  and  a  CMOS  detector  array.  It  was  packaged  using 
Spindler  and  Hoyer  microbench  components.  Design  and  performance  of  this 
prototype  system  are  discussed. 


1.  Introduction 

We  are  working  on  various  architectures  for  three-dimensional  optoelectronic  computers 
[Neff].  They  all  share  free  space  holographic  interconnects  from  one  planar  optoelectronic 
processing  board  to  another.  Systems  of  this  type  avoid  the  interconnection  bandwidth 
bottleneck  of  systems  with  strictly  in-plane  electronic  interconnects.  We  will  describe 
the  results  of  an  experiment  to  construct  a  system  using  this  technology.  This  is  also  the 
first  step  in  the  development  of  a  testbed  for  evaluation  of  source  and  detector  arrays, 
holograms  and  the  necessary  optomechanics. 


2.  Overview 

Figure  1  is  a  schematic  of  the  experiment.  The  system  implements  a  simple  4-bit  counter 
using  two  optical  assemblies  with  associated  electronics.  The  optical  assemblies  are 
designed  to  demonstrate  both  fan-out  and  fan-in  operations.  The  upper  arm  has  8  inputs 
which fan-in to 4 outputs, performing an optical OR operation, and the lower fans-out
4 inputs to 8 outputs. The electronics perform the Boolean expressions required for the
counter, with the right half of the system counting the odd numbers and the left the even.
On each clock, or increment signal, the left counts 1, 3, 5, 7, 9, 11, 13, 15, 1... and the
right 0, 2, 4, 6, 8, 10, 12, 14, 0... etc. In this paper we describe the optical system. There
are two separate optical arms, with 8 and 4 input channels respectively. VCSEL arrays
are used as the input sources, and the light from these is collimated using a microlens
array. The collimated beams illuminate an array of binary phase Computer Generated
Holograms (CGHs), one for each input channel. The light from the holograms is then
Fourier  transformed  with  an  achromat,  and  an  array  of  photodetectors  is  placed  at  the 
Fourier  plane  to  detect  the  output  from  the  system.  The  components  are  mounted  with 
standard  Spindler  and  Hoyer  microbench  parts.  Both  the  source  and  receiver  assemblies 
are  mounted  on  x-y  adjustable  mounts  to  allow  for  alignment  of  the  light  beams  with 
the  detector  pads.  The  complete  optical  path,  using  a  10mm  focal  length  achromatic 
transform  lens,  is  less  than  2cm  long. 


3.  System  Design 

The source assembly consists of the VCSEL array, microlens and holograms (Figure
2). The VCSELs are arranged in a linear array on a 250 µm pitch. They operate at
approximately 850 nm and emit a close to Gaussian beam with a (e^-2) half-angle of 0.1
radians. The peak output power is approximately 1.5 mW (V = 5.5 V, I = 9.0 mA). The
beams are then collimated using a commercially available microlens array, which has a
focal length of 560 µm and a 250 µm pitch. A spacer is glued to the VCSEL surface so
each VCSEL lies at the focus of a microlens when the array is glued to the spacer. The
lens array is then aligned with the VCSEL array using an active technique: the VCSELs
are turned on and the output beams monitored with a CCD camera. The lens array is
placed on the spacer, then translated until the best optical field is obtained. A UV curing
epoxy is used to attach the array. The hologram array is spaced approximately 300 µm
from the microlens array using a spacer ring glued to the VCSEL package (this allows
reuse of the collimated array).

There are two spatially variant CGHAs, one with 4 holograms and the other with
8. The first CGHA performs fan-out of a single input beam to two output spots while
the latter performs fan-in of two beams to one spot (Figure 3). The holograms were
encoded using a simulated annealing algorithm [Feldman, Yoshikawa]. The simulated
annealing algorithm was used to encode 64 x 64 pixel images which were replicated to
create a 128 x 128 array. We used an optical lithography system to fabricate a master
set of holograms with a resolution of 2.2 µm. The holograms are then copied into a phase
medium by contact copying the master holograms onto a quartz substrate and etching
using a reactive-ion etching system [Zhang]. This produces a two-level phase hologram
with a theoretical best efficiency of 40%. Measured efficiencies of these CGHAs varied
from 20 to 26%. This difference may be due to roughness of the quartz where it is etched.
The quartz CGHA is aligned to the VCSEL beams using an active alignment technique
similar to that described above. Figure 4 shows the measured spot pattern for the fan-in
CGHA when four of the eight lasers are on.
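
A minimal sketch of simulated annealing for a two-level phase hologram follows. The cost function (maximise the weakest target spot), the linear cooling schedule, the 16 x 16 size and the target orders are all assumptions chosen for a short example; they are not the parameters of the cited encoding algorithms. The 40% figure quoted above is consistent with the (2/π)² ≈ 40.5% first-order limit of a binary phase grating.

```python
import numpy as np

def target_fraction(bits, targets):
    """Fraction of total power landing on each target diffraction order."""
    field = np.where(bits, -1.0, 1.0)            # two phase levels: 0 and pi
    power = np.abs(np.fft.fft2(field)) ** 2      # Fourier-plane intensity
    return np.array([power[t] for t in targets]) / power.sum()

def anneal_hologram(n=16, targets=((0, 3), (3, 0)), steps=3000, t0=0.05, seed=0):
    """Return (best pixel pattern, initial weakest-spot fraction, best fraction)."""
    rng = np.random.default_rng(seed)
    bits = rng.integers(0, 2, size=(n, n)).astype(bool)
    cost = -target_fraction(bits, targets).min()
    start_cost = cost
    best_bits, best_cost = bits.copy(), cost
    for step in range(steps):
        temperature = t0 * (1.0 - step / steps) + 1e-9   # linear cooling
        r, c = rng.integers(0, n, size=2)
        bits[r, c] ^= True                               # trial move: flip one pixel
        new_cost = -target_fraction(bits, targets).min()
        accept = new_cost <= cost or rng.random() < np.exp((cost - new_cost) / temperature)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best_bits, best_cost = bits.copy(), cost
        else:
            bits[r, c] ^= True                           # rejected: undo the flip
    return best_bits, -start_cost, -best_cost

pattern, initial_fraction, best_fraction = anneal_hologram(steps=500)
```

Because a two-level phase pattern is real valued, every target order is accompanied by an equally bright conjugate order, which matches the remark in the Figure 4 caption that both +1 and -1 orders are visible.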

A 2 x 8 photodetector array was used to detect the output from the system. This
was designed locally and fabricated in 2 µm CMOS by the MOSIS fabrication service.
The array consists of a high sensitivity 1 x 8 array and a 1 x 8 array with lower sensitivity.
Signals from the detectors are amplified and thresholded to provide TTL level outputs
from the detector IC [Mao]. The high sensitivity array was used to form the output
of the system and required > 40 µW of optical power to generate a TTL level output
pulse. The detectors are 200 x 200 µm² on a 250 µm pitch. A typical light spot size at
the detector for this system is approximately 100 µm, so the large detector area allows
for shifting of the spots due to wavelength variations, as well as minor alignment errors
and distortions caused by the optical system.
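
The quoted spot size can be cross-checked with a simple Gaussian beam estimate. The sketch assumes ideal thin lenses, a paraxial Gaussian beam whose 0.1 rad half-angle is measured at the e^-2 intensity point, and no aberration or hologram aperture effects; the function names are illustrative.

```python
import math

def collimated_radius_um(microlens_focal_um, half_angle_rad):
    """Beam radius after the microlens for a source at its focal point."""
    return microlens_focal_um * math.tan(half_angle_rad)

def focused_spot_diameter_um(wavelength_um, lens_focal_um, beam_radius_um):
    """e^-2 diameter of a Gaussian beam focused by the transform lens."""
    return 2.0 * wavelength_um * lens_focal_um / (math.pi * beam_radius_um)

beam_radius = collimated_radius_um(560.0, 0.1)                    # about 56 um
spot_diameter = focused_spot_diameter_um(0.85, 10_000.0, beam_radius)  # about 96 um
```

Under these assumptions the collimated beam (about 112 µm across) fits within the 250 µm pitch, and the focused spot of roughly 96 µm agrees with the approximately 100 µm stated above, leaving about 50 µm of margin on each side of a 200 µm detector.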
Figure 1. Overview of complete holographically interconnected counter.

Figure 2. Packaging of one of the VCSEL arrays with the microlens and hologram arrays.

Figure 3. Hologram design patterns for fan-out and fan-in.

Figure 4. Measured spot pattern of the fan-in CGHA with four lasers on. Note both +1 and -1
order as well as 0 order are visible.

Figure 5. Oscilloscope trace of the four bit output counter for even numbers. Bit_0 is not
shown as it is 0 all the time. The upper trace is the clock or increment signal for the
counter. The next three traces are the count bits.

4. System operation

The  system  speed  is  limited  by  the  minimum  power  required  for  the  receiver  chips  to 
generate a TTL level output pulse. This in turn depends on the efficiency of the holograms,
and the VCSEL output powers. Figure 5 shows the trace of the four bit output
counter  for  even  numbers.  Bit  0  is  not  shown  as  it  is  0  all  the  time.  The  upper  trace  is 
the  clock  or  increment  signal  for  the  counter.  One  can  see  all  even  numbers  shown  for 
this  100  kHz  clock  rate.  This  was  the  highest  speed  achievable  for  stable  operation  of 
the  counter.  There  were  a  number  of  VCSELs  whose  power  dropped  so  much  that  they 
were  no  longer  able  to  trigger  the  receiver  chip.  The  appropriate  signals  were  hard  wired 
electronically  to  get  around  this  problem.  The  system  is  quite  stable,  probably  due  to 
its  large  beam  position  tolerance. 
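
A rough optical power budget shows why a drop in VCSEL power can stop the receiver from triggering. The sketch assumes that the measured hologram efficiency is the total fraction of light reaching the intended spots, that it is shared equally between fan-out spots, and that all other losses are negligible; none of these assumptions is stated in the text, and the function names are illustrative.

```python
def power_per_spot_uw(source_mw, hologram_efficiency, fan_out):
    """Optical power in each output spot, in microwatts."""
    return source_mw * 1000.0 * hologram_efficiency / fan_out

def minimum_source_mw(threshold_uw, hologram_efficiency, fan_out):
    """VCSEL power below which a spot no longer reaches the receiver threshold."""
    return threshold_uw * fan_out / (hologram_efficiency * 1000.0)

# Values quoted in the text: 1.5 mW peak output, 20 to 26% efficiency, 40 uW threshold.
worst_case_spot = power_per_spot_uw(1.5, 0.20, fan_out=2)
best_case_spot = power_per_spot_uw(1.5, 0.26, fan_out=2)
failure_level = minimum_source_mw(40.0, 0.20, fan_out=2)
```

With these numbers each spot carries roughly 150 to 195 µW, a margin of about four over the 40 µW threshold, so under these assumptions a VCSEL whose output falls below about 0.4 mW would fail to trigger its detector.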


5.  Conclusions 

We have demonstrated a simple 3-D holographically interconnected counter using VCSELs.
The system can be used for further evaluations of design algorithms and fabrication
technologies for CGHAs, as well as receiver arrays and VCSELs, by simply
interchanging the appropriate parts. This can be easily accomplished since standard
optomechanical mounts and electronic sockets are used throughout. This valuable first
step exposed many of the issues involved in the design, packaging, alignment and assembly
of a complete 3D system.
Abstract. An optical image processor with optical feedback and gray scale
capability, compactly organized around bistable binary amorphous silicon and
ferroelectric liquid crystal devices and a one lens correlator using a reflective
multiplexed photopolymer hologram, is briefly presented. First experiments
relating to the interconnection holograms are reported.


1. Introduction

A broad class of efficient and widely used low level image processing operations are the
non-linear filters such as ranked-order filtering and morphological operations. They can
be performed on binary images by convolving them with a binary or gray scale kernel
and thresholding the output. Extension to gray scale images is performed by considering
them as a finite collection of binary slices, processing each slice as previously and finally
summing the processed slices to obtain the processed gray scale image [1].
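
The slice-and-sum procedure described above (often called threshold decomposition) can be sketched as follows. The example assumes a flat 3 x 3 kernel of ones and zero padding at the image border; the function names are illustrative and the kernels used in the optical processor may differ.

```python
import numpy as np

def neighbourhood_sum(img, k=3):
    """Correlate a 2-D array with a k x k kernel of ones (zero padded)."""
    p = k // 2
    padded = np.pad(img, p)
    out = np.zeros(img.shape, dtype=int)
    for dr in range(k):
        for dc in range(k):
            out += padded[dr:dr + img.shape[0], dc:dc + img.shape[1]]
    return out

def rank_filter_by_slices(gray, rank, levels, k=3):
    """Rank-order filter of an integer image with values in [0, levels - 1].

    Each binary slice (gray >= t) is convolved and thresholded at `rank`;
    the thresholded slices are then summed. rank=1 gives dilation,
    rank=(k*k+1)//2 the median, and rank=k*k erosion.
    """
    result = np.zeros(gray.shape, dtype=int)
    for t in range(1, levels):
        binary_slice = (gray >= t).astype(int)
        result += neighbourhood_sum(binary_slice, k) >= rank
    return result

rng = np.random.default_rng(1)
image = rng.integers(0, 8, size=(6, 6))
median = rank_filter_by_slices(image, rank=5, levels=8)
dilated = rank_filter_by_slices(image, rank=1, levels=8)
```

The result equals the rank-th largest value in each 3 x 3 neighbourhood, so only binary convolution and thresholding are needed per slice, which is what the bistable devices provide.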


2.  Proposed  architecture 

Optically addressed spatial light modulators (OASLM) using Ferroelectric Liquid Crystal
(FLC) are considered here. VLSI backplane OASLM [2] could also be used. The device
is a mosaic of nine independently driven aSi:H/FLC bistable OASLMs with fixed high
threshold [3] (250 pW' jmiiT at 633 nm and 2.5 kHz frame rate) realized on the same 2"
x 2" optical flats. It is illuminated on both sides through the polarizing beam splitters
(see Figure 1).

Figure 1. The processor architecture

The aSi:H/FLC interface is reflective (pixelated aluminium). The central OASLM
faces the opposite direction compared to the peripheral ones. The right side performs the
feedback by using an 80 mm focal length symmetrical lens, a multiplexed reflective phase
hologram and independently controlled feedback laser diodes (1 mW, 670 nm). The
left side performs correlation from the central OASLM to the peripheral ones by using a
power source (10 mW, 633 nm HeNe laser), a variable light biasing for threshold control
and erasing (also 1 mW, 670 nm laser diodes) and a lens identical to the right one.
The multiplexed reflective holograms, recorded at 514 nm on a separate bench, use recent
Du Pont photopolymer [4] materials and color tuning films [5] to shift the replay wavelength
to 633 nm for the left hologram or 670 nm for the right hologram. The gray level 256 x
256 pixel input image in the input plane is formed by an input lens on the 8 x 8 mm useful
aperture of the central OASLM. The biasing light from the central right thresholding/
erasing diode through the right hologram is incoherently added to the image to determine
the threshold level, and the result is thresholded and stored as a binary image in the FLC.
Using the power source, this slice is convolved simultaneously with the eight kernels
recorded in the hologram chosen for the application. One or a few of the convolved images
are stored on a peripheral OASLM while the previously stored images on the other
OASLMs are retained. The other slices of the image are successively formed, processed
in the same way and stored in peripheral OASLMs. The final summing operation is
performed by illuminating simultaneously with the feedback diodes the OASLMs where
the previously processed slices are stored. The input image is blocked using an input
shutter. The gray level output image is directed onto the output plane by activating an
FLC quarter waveplate which rotates the polarization of the reflected beam by ideally
90 deg. Binary image processing can be performed in the same way, and in
that case operations are easily cascaded by using the feedback diodes.


3.  Holograms 

3.1. Photopolymer hologram

Reducing the power of the main optical source requires a hologram with high reflection
efficiency. Phase holograms are therefore used, and photopolymer materials are chosen because
of their low cost, very simple process, performance on replay and stability over time. A
drawback of these materials is that they may not be recorded in situ in practice because of
their low sensitivity (15 to 30 mJ/cm²). The process consists of just a 1 min UV light
exposure and one hour of baking at 100 deg. Celsius. During the baking, the index wave
in the material shrinks slightly (about 2%). The photopolymer material is available on
non-rigid mylar with a removable cover sheet. The tacky photopolymer is laminated on a
2" x 2" glass substrate so that it can be easily handled. Some experiments have been performed
to measure reflection efficiency versus external incidence of a quasi unslanted grating
recorded in 20 µm thick HRF 600 material from Du Pont.

The recording was performed using an argon-ion laser at 514 nm with 60 mJ/cm^2 on
each beam (6 mW/cm^2 for 10 s). The hologram is rotated from the bisector plane of
the two beams by the angle required to achieve a 1.5 deg. external reflection angle. The
beam angle is 150 deg.; it was computed using a Kogelnik-based program so that,
assuming 2% shrinkage during the heat treatment, the maximum reflection efficiency
at the same wavelength occurs at normal incidence with an external reflection angle
of 1.5 deg. This program also computes how far the hologram must be rotated
from the bisector plane. An external reflection efficiency of more than 85% is obtained
around normal incidence, so the expected high reflection efficiency and shrinkage
compensation are verified. Taking into account the power reflected at the
hologram surface (4%), an internal reflection efficiency of 90% can be estimated.
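
As a rough plausibility check on these figures, the sketch below uses the standard Kogelnik coupled-wave result for a lossless, unslanted reflection phase grating replayed at the Bragg condition, efficiency = tanh^2(pi * dn * d / (lambda * cos(theta))). The function names are illustrative, and taking cos(theta) = 1 (near-normal replay) is an assumption; the text does not quote the index modulation, so the value computed here is only an estimate of what a 90% internal efficiency would imply for a 20 µm layer at 514 nm.

```python
import math

def reflection_efficiency(delta_n, thickness, wavelength, cos_theta=1.0):
    """Kogelnik efficiency of a lossless unslanted reflection grating at Bragg."""
    nu = math.pi * delta_n * thickness / (wavelength * cos_theta)
    return math.tanh(nu) ** 2

def index_modulation_for(efficiency, thickness, wavelength, cos_theta=1.0):
    """Invert the formula above: index modulation needed for a given efficiency."""
    nu = math.atanh(math.sqrt(efficiency))
    return nu * wavelength * cos_theta / (math.pi * thickness)

# Exposure dose per beam quoted in the text: 6 mW/cm^2 for 10 s (in J/cm^2).
exposure_dose = 6e-3 * 10

# Index modulation implied by 90% internal efficiency (assumed cos_theta = 1).
delta_n = index_modulation_for(0.90, thickness=20e-6, wavelength=514e-9)
print(f"dose = {exposure_dose * 1e3:.0f} mJ/cm^2, delta_n = {delta_n:.4f}")
```

Under these assumptions the required index modulation comes out at roughly 0.015; this is an estimate derived from the formula, not a measured value from the experiments.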

3.2.  Hologram  multiplexing 

Sequential exposure multiplexing was chosen first because it requires only a simple
optical set-up. The recording set-up is the same as discussed above; the hologram
is simply rotated by 45 deg. around its axis between two successive exposures. Consider
two recorded gratings, with grating vectors labelled K1 for the first exposure and K2 for
the second. Observing the reflected beams, the main1 beam reflected by K1 carries only 24%
of the incident power and the main2 beam reflected by K2 carries 12%; the reflection angle θ
is 2 deg.

Although the exposure times (2.5 s) were equal for both gratings, the grating recorded
first gives twice the reflection efficiency of the second. This is due to
non-uniform photopolymerisation of the monomer over time [6, 7]. Brightness balancing
could be achieved by increasing the exposure time from the first to the last recording. Most
of the remaining energy goes into two parasitic beams that lie in the
same plane as the main ones. The main2 reflected beam is also reflected by the K1
grating, which is not far enough off Bragg. This beam, travelling towards the back side of the
material, is then reflected all along its path by K2 for the same reason,
giving the cross-reflected12 beam. The same process occurs for the main1 beam, giving
cross-reflected21. The cross-reflected12 beam is as strong as main2 (12% reflection
efficiency) and the cross-reflected21 beam is 4%. Other very weak beams can be observed
and are described as back-reflected, 2nd order, cross-cross-reflected and back-back-reflected.

The 1st order cross-reflected beams must be reduced as much as possible. Qualitatively,
this can be done by using a large θ angle and increasing the material thickness. The
analytical Kogelnik theory cannot be used to model the coupling between a main reflected
beam and the adjacent grating vector, because the theory is not valid when the grating
vector does not lie in the plane of incidence. The three-dimensional numerical theory of
Moharam et al. [8] must be used; it describes this conical diffraction problem rigorously.
Some experiments, not reproduced here, were carried out with θ angles of 10, 20
and 30 deg. The angle must be as large as possible and not below 10 deg.; 15 deg. is a
good compromise between this constraint, compactness and the feasibility of the lens.


4.  Conclusion 

An architecture for an image processor based on a symmetrical multichannel looped
correlator using photopolymer, a-Si:H and FLC is presented with possible dimensions.
In addition to iterative binary image processing, gray-scale processing using threshold
decomposition, possibly with different gray-scale kernels for each slice, is shown to be
easily implementable. The use of Du Pont photopolymer materials to produce the required
reflective multiplexed phase holograms has been tested experimentally. The processor may be
feasible with low-cost, available devices.


Abstract. We describe the design, fabrication and testing of an optical interconnect system
constructed  from  a  two  dimensional  array  of  symmetric  self-electro-optic  effect  devices  and  a  free 
space  perfect  shuffle  module. 


1.  Introduction 

The perfect shuffle has been proposed as an efficient interconnect that optics can do well [1].
However,  to  our  knowledge,  no  free  space  optical  system  has  been  constructed  that  tests  the 
operation  of  a  shuffle  between  optical  logic  elements.  Most  previous  work  has  focused  on  the 
development  of  the  perfect  shuffle  optics  without  regard  to  interfacing  to  the  optical  logic 
elements. 

An optical system designed to implement sequential logical processing has
been fabricated and used to demonstrate transfer of optically encoded data from an optical logic
array through a perfect shuffle interconnect and back onto the logic array. An array of symmetric
self-electro-optic effect devices (S-SEEDs) [2] serves as the logic array, and the perfect shuffle is
similar to that developed at Heriot-Watt University [3].

2.  System  Design 

The  basic  architecture  is  shown  in  Figure  1.  The  input,  output  and  power  supply  optics  are  not 
shown. The primary optical circuit is implemented in a loop. The S-SEED array is divided up
so that each column corresponds to a stage of the interconnect. The output of the S-SEED array
is routed through a one-dimensional perfect shuffle module and then imaged back onto the
S-SEED array displaced by one column. This produces a multistage interconnect with perfect
shuffles between each stage.

The S-SEED array is arranged so that it is used as 8 columns of 16 devices each. This
size was selected to match the field of a commercial laser objective used to image the S-SEED.
Spots  to  read  out  the  array  are  generated  by  two  lasers,  one  for  the  even  columns  and  one  for 
the  odd  columns.  The  necessary  spot  pattern  is  generated  by  a  binary  phase  diffractive  element. 

The perfect shuffle employs the basic split-and-shift approach to implement a theoretically
lossless interconnect [3]. The other major advantages are that it can be constructed from readily
available components, it can be aligned in stages, it has identical loss (theoretically zero) for
all the beams through the system, it could be enhanced to provide internal fanout, and it would
work with data in differential representation.

A schematic of the hardware used for the perfect shuffle is shown in Figure 2. The
collimated p-polarized beam is transmitted through the polarizing beam splitter (PBS) and
quarter wave plate (QWP) at the upper left of the figure. The transform lens focuses the beam,
and the focal plane array is split into two halves by a knife-edge mirror placed perpendicular
to the direction of propagation. Half of the array is reflected; the other half passes the edge
and is recollimated and transmitted through another QWP. The beams are directed using total
internal reflection in a 45° prism so that the beams from the two portions of the array are on
the same axis but propagate in opposite directions. Transform lenses focus each of the
half-arrays onto opposite sides of a second mirror patterned with stripes, at which plane the
arrays are interlaced; they then exit the module in the same polarization at the lower right PBS.

[Figure labels: SEED array; split-and-shift; quarter wave plate]

Figure 2. Schematic of hardware used in perfect shuffle interconnection module.


The design of the shuffle module incorporates a factor of √2 magnification in the imaging
operation. The reason is the 2:1 change that occurs between input and
output array pitch in one dimension when unity magnification is used. It is considerably easier
to design an anamorphic telescope that changes the ratio of the object sides by a factor of
two than one that provides strictly 2:1 magnification in one dimension. Hence, the magnification
of √2 : 1/√2 used in the anamorphic telescope determined the relay lenses that were used in the
shuffle implementation. The anamorphic telescope used compound lenses, each constructed from a
cylindrical lens and a spherical lens. The two compound lenses are oriented with their cylinder
axes perpendicular. The anamorphic telescope was designed to have telecentric input and
output, i.e. the pupils of the spherical/cylinder lens combination and the single spherical lens
are coincident. The alignment task is considerably easier than if a non-telecentric design had
been used.
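
The pitch bookkeeping behind the √2 factor can be checked with a short sketch. It assumes, as an illustration that the text does not spell out, that interlacing the two half-arrays halves the spot pitch along the shuffle axis and leaves the other axis unchanged, and that the anamorphic telescope magnifies by √2 along the shuffle axis and by 1/√2 across it. The function name and the unit input pitch are hypothetical.

```python
import math

def pitch_after_loop(pitch, shuffle_mag=math.sqrt(2)):
    """Return (shuffle-axis pitch, cross-axis pitch) after shuffle and telescope."""
    # Interlacing two half-arrays halves the pitch along the shuffle axis only.
    along, across = pitch / 2, pitch
    # The shuffle module images both axes with the same magnification.
    along, across = along * shuffle_mag, across * shuffle_mag
    # Assumed anamorphic telescope: sqrt(2) along the shuffle axis, 1/sqrt(2) across it.
    along, across = along * math.sqrt(2), across / math.sqrt(2)
    return along, across

along, across = pitch_after_loop(1.0)
print(along, across)  # both should return to the input pitch, up to rounding
```

Under these assumptions both axes return to the original pitch, which is what imaging back onto the same S-SEED array requires.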

The data input to the cascade is provided by an electrically addressed linear array of
surface emitting lasers (VCSELs), which is used to set the first column of S-SEEDs. The VCSEL
array is driven directly by an open-collector TTL circuit that is also used to sequence the
column and programming lasers.

Figure 3 shows a schematic of the basic system. Light from the even and odd column
lasers is combined and then the array of spots is generated. These spots are then directed
through a beam combining system and focused onto the S-SEED array. The reflected light
from the S-SEED array is shuffled and then has its spot pitch corrected by an anamorphic
telescope before being combined and imaged onto the S-SEED array. The entire system is doubly
telecentric for improved alignment tolerance, except at the intermediate focus of the anamorphic
telescope. The system was constructed using the milled baseplate approach [4].


3.  Operation 

The system has been assembled and aligned, and has demonstrated cascaded data transfer with
100% correct operation at a switching time of 1.5 µs. Figure 4 shows a computer-enhanced
picture of the light reflected from the S-SEED plane. The system is clocked at 100 kHz
and the data from the first column of S-SEED devices are cascaded through the succeeding
columns. Notice that because S-SEEDs are differential devices, each column is made up of two
spots. The first column was set with the upper eight devices in one state and the lower eight in
the other. This light is shuffled in one dimension and cascaded to the second column. The light
reflected from the second column of S-SEEDs shows the results of the shuffle and the
inversion operation of the S-SEED. Light from the third and subsequent columns is cascaded
and inverted in a similar manner. Note that after 4 (= log2 16) cascades the data have
returned to the original configuration.
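
The cascade described above can be reproduced with a minimal simulation. It is a sketch only: the perfect shuffle is modelled as interleaving the upper and lower halves of a column, each S-SEED stage is modelled as a plain inverter, and the function names are illustrative rather than taken from the system.

```python
def perfect_shuffle(column):
    """Interlace the upper and lower halves of a column (split and interleave)."""
    half = len(column) // 2
    upper, lower = column[:half], column[half:]
    shuffled = []
    for u, l in zip(upper, lower):
        shuffled.extend([u, l])
    return shuffled

def cascade(column):
    """One stage: shuffle the column, then invert every bit."""
    return [1 - bit for bit in perfect_shuffle(column)]

# First column: upper eight devices in one state, lower eight in the other.
first = [0] * 8 + [1] * 8
columns = [first]
for _ in range(4):  # 4 = log2(16) cascades
    columns.append(cascade(columns[-1]))

for index, column in enumerate(columns, start=1):
    print(index, "".join(str(bit) for bit in column))
```

With 16 devices per column, four shuffles restore the original ordering, and an even number of inversions restores the original logic levels, so the fifth column printed matches the first.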

For this system the key factor limiting the system speed was not variation in cascaded
beam powers [3], but rather crosstalk between the readout and cascaded arrays of beams. The
readout light was generated with a binary phase grating designed to illuminate alternate
columns. However, some light from orders that were suppressed still reached the S-SEED
plane. These orders were aligned with the other columns of S-SEEDs; the columns where the


Figure  4.  Computer  enhanced  image  of  the  focal  spot  array  on  the  device  plane,  showing 
leftmost  column  shuffled  at  each  subsequent  column. 

4.  Conclusions 

In  conclusion,  we  have  demonstrated  the  first,  to  our  knowledge,  free  space  optical  system  that 
cascades  optical  data  through  a  perfect  shuffle  between  logic  elements.  We  have  shown  its 
100%  correct  operation  and  found  the  operating  speed  to  be  limited  by  crosstalk  on  the  array. 
This  system  demonstrates  that  extensive  free  space  optical  systems  can  be  constructed,  aligned 
and  demonstrated.  However,  difficulty  cascading  back  to  the  same  image  plane  suggests  that 
interconnect  architectures  that  can  be  more  easily  implemented,  such  as  the  butterfly  or  Banyan, 
are preferable to the perfect shuffle for free space applications.


Abstract. A parallel sorting algorithm and its implementation on a 3-D multistage optoelectronic
sorting  network  are  presented.  The  network  operates  on  bit-slices  of  words  and  combines  regular 
structure  and  simple  interconnections  between  stages. 


1.  Introduction 

One of the advantages of parallel optical storage systems, such as volume holographic
memories, is the ability to store and retrieve two-dimensional blocks of data in parallel. Any
attempt to process these blocks in a traditional fashion would result in bottlenecks between
the memory and the processing units of the electronic computer. To fully exploit the inherent
parallelism of optical memory, we must design new processing architectures capable of handling
large blocks of data in parallel. Such architectures are expected to be based on multiple stages
of fine-grain optoelectronic processing elements [1].

Sorting is a common operation in a large range of applications and can be extremely
time-consuming for long lists of words [2]. In this paper we present tomographic sorting, a
parallel pipelined algorithm that can be mapped directly onto a three-dimensional computer
architecture that uses optical interconnections to propagate data and control bits from one
processing plane to another. The algorithm is a combination of the odd-even transposition sort
and the standard radix sort [2]. We begin our discussion by describing the algorithm and the
structure and functionality of the tomographic sorting network. Then we discuss how this network
can be implemented as a multistage architecture similar to the three-dimensional optoelectronic
computer under development at the Optoelectronic Computing Systems Center in Colorado.
Tomographic sorting can serve as a test application for this computer.

2.  Sorting  Network 

The tomographic sorting network operates on a list of N words with M bits each and has M
processing stages. A two-dimensional version of the network is shown in Figure 1. The N words
are stored as M bit slices (hence the term tomographic), with the k-th slice containing the k-th
significant bit of every word. All the bit slices are pipelined into the network, and remain there
for a fixed number of clock cycles until they are sorted and ready to be transferred out and into
a buffer. Sorting is performed by rearranging bits on the same slice so that, eventually, words
are reshuffled and end up in ascending order from top to bottom. After the bit slices are loaded
into the sorting unit, only control signals are transmitted from one stage to another to indicate
any necessary bit exchanges.

Two types of cells are present in the tomographic sorting network: data and control cells.
Each cell contains a flip-flop to hold the value of a bit. Rows of data cells are interleaved with
rows of control cells as shown in Figure 1. The control cells are basically bitwise
compare-and-exchange modules with some additional control lines. Optical signals propagate along
the horizontal direction, while electronic signals traverse the vertical direction.

Figure 1. Tomographic sorting network in two dimensions.

A column of data and control cells constitutes a stage. The data cells in every stage
receive and hold the contents of the corresponding bit slice. Notice that all signals flow from
left to right and that the stages are numbered in descending order from M to 1 (from left to right)
to coincide with the bit order in a word. The number of data cells in each stage is equal to N,
while the number of control cells is N-1. The number of stages required is equal to the
length of each word in bits. The interconnection pattern between any pair of stages is a
straightforward one-to-one mapping. Another characteristic of this network is its regular
structure, which also contributes to a simple implementation and permits easy scaling up to
accommodate larger sets of words.

3.  Algorithm  Description 

As mentioned earlier, the sorting algorithm is a combination of the odd/even transposition and
the radix sort algorithms. The process is initiated by looking at the most significant bit slice and
trying to partition it into two groups: the 0-group at the top and the 1-group at the bottom. Every
pair of adjacent bits that are not equal allows a decision on the relative magnitude of two words
to be reached. If the two words must be exchanged, the process is completed in a pipelined
fashion with an unconditional exchange control signal gradually forcing all the lower significance
bits of the two words to exchange positions. If a decision cannot be reached at the MSB slice
because the two bits under consideration are equal, the decision is deferred to the next most
significant bit slice by sending a conditional exchange control signal to the next stage. This will
force the next control cell to perform a bit comparison to determine the relative magnitude of
the two words. While an unconditional or conditional exchange signal ripples through all the
stages of the network, a new set of tests can be initiated at the MSB slice. The outcome of these
tests and the subsequent execution of any action required will not affect the completion of the
previously initiated exchanges that may still continue to occur towards the final stages of the
pipeline.

Three distinct phases are present in the tomographic sorting operation: (i) data loading,
during which the bit slices are pipelined into the sorting stages; (ii) sorting; and (iii) data
unloading, during which the bit slices of sorted words exit the pipe. During the first phase, bit
slices are pipelined into the sorting unit starting with column 1. In one cycle, an entire column
is transferred in parallel from stage k to stage k-1. After M cycles, all M bit slices are loaded
into the data cells of the sorting unit. In the second phase, the operation begins by examining
pairs of bits in the control cells of the MSB bit slice in a way similar to the bubble sort
algorithm. If two bits are found to be out of order, they are exchanged and a forced exchange
control signal is generated and pipelined to the remaining stages. Since the words are sorted
in ascending order from top to bottom, the combination 1,0 in a pair of bits initiates this
process. If two bits are equal, no decision can be made on the relative magnitude of the words, so
the control cell issues a compare-and-exchange signal to the corresponding control cell in
the next stage. If two bits are in the right order (i.e., 0,1), no action takes place and both
control lines to the next stage are low. At the end of this phase, the words are stored in
ascending order from top to bottom, i.e., the minimum value is located in the first row of
data cells, while the maximum value is found in the bottom row of the array. Finally,
during the third phase, the bit slices of the sorted words can be flushed out of the pipe, one slice
at a time, in a total of M cycles. Since data move from left to right, the first and last phases
can overlap when repeated sorting operations must be performed.
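
The sorting phase described above can be illustrated with a small software model. This is a simplified sketch, not the hardware timing: each pairwise decision is allowed to ripple through all stages before the next one starts, whereas the network overlaps them in a pipeline. The alternation between even and odd row pairs follows odd-even transposition, and the signal names and function names are illustrative.

```python
CONDITIONAL, UNCONDITIONAL, IDLE = "CE", "UE", "idle"

def to_bit_slices(words, m):
    """Return m bit slices, MSB slice first; each slice holds one bit of every word."""
    return [[(w >> (m - 1 - k)) & 1 for w in words] for k in range(m)]

def from_bit_slices(slices):
    """Reassemble the words from MSB-first bit slices."""
    words = [0] * len(slices[0])
    for bit_slice in slices:
        words = [(w << 1) | b for w, b in zip(words, bit_slice)]
    return words

def ripple_pair(slices, row):
    """Let one decision for rows (row, row + 1) ripple from the MSB stage onward."""
    signal = CONDITIONAL  # the MSB stage always compares its two bits
    for bit_slice in slices:
        upper, lower = bit_slice[row], bit_slice[row + 1]
        if signal == CONDITIONAL:
            if upper == lower:
                continue  # undecided: pass a conditional exchange to the next stage
            signal = UNCONDITIONAL if (upper, lower) == (1, 0) else IDLE
        if signal == IDLE:
            break  # pair already in order (0, 1): both control lines stay low
        bit_slice[row], bit_slice[row + 1] = lower, upper  # forced exchange

def tomographic_sort(words, m):
    """Sort m-bit words in ascending order by exchanging bits within bit slices."""
    slices = to_bit_slices(words, m)
    n = len(words)
    for step in range(n):  # odd-even transposition needs at most n rounds
        for row in range(step % 2, n - 1, 2):
            ripple_pair(slices, row)
    return from_bit_slices(slices)

print(tomographic_sort([5, 3, 7, 0, 2, 6, 1, 4], m=3))
```

Only bits within a slice are ever exchanged, and the only information passed between slices is the conditional or unconditional exchange signal, which mirrors the one-to-one interstage connections of the network.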

4.  Discussion 

Despite  the  fact  that  special  purpose  numerical  sorting  processors  and  networks  have  relaxed 
the  burden  of  sorting,  when  the  size  of  these  networks  becomes  large,  the  latency  of  the  data 
input/output  process  and  the  complexity  of  the  interconnections  between  stages  become  serious 
bottlenecks  that  render  their  implementation  impractical.  An  optoelectronic  sorting  network 
with straight pass optical interconnections for interstage communications, such as the one
presented in this paper, lends itself to a simpler implementation.

Figure 2. (a) Time complexity and (b) space complexity of the bitonic sorting network for M=128 versus three
tomographic sorting networks of different sizes.


Tomographic sorting can be completed in O(N) time steps. Although the tomographic
sorting network is not faster than the bitonic sorting network, as shown in Figure 2, it requires
fewer and simpler stages as the number of elements increases. An optical implementation of
the bitonic sorting network [3] becomes prohibitively complex when the list to be sorted exceeds
100 elements. The reduced space complexity of the tomographic sorting network is its main
advantage over other sorting networks.

The tomographic sorting network can be expanded in three dimensions. In the 3D
implementation, the linear bit slices are replaced by two-dimensional bit slices stored as pages
in a parallel optical memory. At the end of the operation, the words are arranged in ascending
order along a serpentine path from the upper left to the lower right corner of the page. The
processing stages here are two-dimensional arrays where interstage communications are carried out
optically and intrastage communications are electronic. A general view of a 3D tomographic sorting
network based on transmissive stages is shown in Figure 3. This design uses heterostructure
phototransistors as the receiving elements and vertical cavity surface emitting lasers as light
sources. Data and control logic remain electronic.

We are currently working towards a VLSI implementation of the 2D version of the
tomographic sorting network. We have also designed and are implementing a recirculating
optoelectronic architecture for tomographic sorting that uses two processing stages and combines
silicon detectors, CMOS electronics, and vertical cavity surface emitting laser arrays [4].


Figure 3. Schematic of the 3-D optoelectronic architecture. HPT: Heterostructure Phototransistor; VCSEL:
Vertical-Cavity Surface-Emitting Laser.


Abstract: An optoelectronic implementation of bitonic sorting is presented which uses a
recirculating  architecture  to  reduce  the  number  of  required  stages  to  two.  This  architecture 
decreases  the  mechanical  complexity  at  the  cost  of  system  throughput.  In  addition  to  the  system 
layout,  a  bitwise  compare-and-exchange  module  suitable  for  this  architecture  is  presented. 


1.  Introduction 

With sorting applications varying from knowledge base and database manipulation to
telecommunications switching, it appears certain that efficient and practical methods of
implementing sorting will continue to be important. While a wide variety of sorting
algorithms have been explored for implementation in both hardware and software [1], only
a few implementations have been proposed which utilize the parallelism associated with
optoelectronic processing arrays [2,3]. In both of these previous works, the implementation
is based on a straightforward construction of the bitonic sorting network shown in figure
1. Each stage of the sorting network is comprised of latching compare-and-exchange (C&E)
modules which perform a bitwise comparison on two input words. The stages are connected
with optical perfect shuffle interconnection networks. In theory, the capacity (defined as the
maximum number of input words or channels) of these sorting networks is unlimited.
However, the number of processing stages required to implement the bitonic sorting
algorithm is dependent on the system capacity. Therefore, the realization of a large capacity
optoelectronic sorter may be limited by the mechanical complexity required to build and
maintain such a large multistage network.

Previously, it has been suggested that the mechanical complexity of a multistage
network can be reduced if the data are recirculated through the same hardware [4]. In the
early seventies, Stone [5] suggested a recirculating architecture for bitonic sorting. This
recirculating architecture utilized an N element array of latches connected via a perfect
shuffle interconnection network to an N/2 element array of C&E modules. The outputs of
the C&E modules were then connected in a one-to-one fashion to the latch inputs. In this
paper we propose an alternative recirculating architecture which utilizes arrays of
optoelectronic bitwise C&E modules.

Figure 1. Bitonic sorting network comprised of compare-and-exchange processing stages
interconnected with a perfect shuffle interconnection.

Figure 2. Schematic layout of a recirculating bitonic sorter using only two stages and two perfect
shuffle interconnections.

2. C&E modules and System Layout

Figure  2  shows  the  system  layout  of  the  proposed  optoelectronic  recirculating  sorter.  It  is 
comprised  of  two  stages  positioned  on  opposite  sides  of  a  beam  splitter.  This  arrangement 
is  compact  and  the  beam  splitter  provides  a  means  for  convenient  optical  input  and  output. 
Each  stage  of  the  recirculating  system  consists  of  C&E  modules  which  compare  adjacent 
elements  of  the  input  set.  The  optical  outputs  of  one  stage  provide  optical  inputs  to  the  other 
stage.  The  optical  data  signals  of  each  stage  are  imaged  onto  multifaceted  holograms  using 
a  micro  lens  array.  The  multifaceted  hologram  deflects  the  optical  signals  to  form  a  perfect 
shuffle  interconnection. 

This recirculating architecture requires that each array hold all of the data bits for
each of the words to be sorted. Thus, the data are available for word-wise C&E modules.
However, the simplicity of bitwise C&E modules, which operate on pairs of input bits, is
desirable. In order to compare m-bit words with bitwise C&E modules, m bitwise C&E
modules are electrically chained together as shown in figure 3a. The electrical connections
are used to communicate control signals such as the exchange status (conditional exchange,
CE, or unconditional exchange, UE) and exchange criteria (up/down, UD).

Figure 3b shows how the bitwise C&E module can be implemented with simple
combinatorial logic. A logical one (zero) on the UD input indicates that if an exchange is
required, the larger word should appear on the H (L) output lines. A logical one on the CE
input initiates a comparison of the two data bits based on the exchange criteria established
by the UD input. A logic one on the UE input forces an unconditional exchange of the input
data. If both the CE and UE inputs are a logic zero then the data are passed through the
C&E module without exchange. This state of the control signals provides the functionality
required for the C&E modules with no arrows in figure 1. The edge triggered latches shown
in figure 3b are used to synchronize the flow of data and control signals within the system.

Figure 3. (a) Bit parallel word comparison using bitwise C&E modules. (b) Combinatorial logic
required to implement bitwise C&E modules.
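
The behaviour of the bitwise C&E module and of the chained word comparison can be sketched in software as follows. This is an illustrative model, not the circuit of figure 3b: it assumes that the first input bit goes to the H output when no exchange takes place, that each module passes CE or UE on to the module for the next less significant bit, and it omits the latches. The function names are hypothetical.

```python
def bitwise_ce(a, b, ud, ce, ue):
    """Model of one bitwise C&E module. Returns (h, l, ce_out, ue_out)."""
    if ue:
        return b, a, 0, 1  # forced exchange, passed on to the next bit
    if ce:
        if a == b:
            return a, b, 1, 0  # undecided: the next bit must compare
        # UD = 1 puts the larger word on H, UD = 0 puts it on L.
        exchange = (a < b) if ud else (a > b)
        if exchange:
            return b, a, 0, 1  # exchange this bit and all lower bits
    return a, b, 0, 0  # pass through; lower bits pass through as well

def compare_words(x, y, m, ud):
    """Chain m bitwise modules, MSB first, and return the (H, L) output words."""
    h = l = 0
    ce, ue = 1, 0  # the MSB module starts with a conditional exchange
    for k in reversed(range(m)):
        a, b = (x >> k) & 1, (y >> k) & 1
        h_bit, l_bit, ce, ue = bitwise_ce(a, b, ud, ce, ue)
        h, l = (h << 1) | h_bit, (l << 1) | l_bit
    return h, l

print(compare_words(3, 5, m=3, ud=1))  # larger word expected on H
print(compare_words(3, 5, m=3, ud=0))  # larger word expected on L
```

Setting both CE and UE to zero at the most significant module would pass both words through unchanged, which corresponds to the C&E modules drawn without arrows in figure 1.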


3.  System  Capacity  and  Scalability 

The  proposed  recirculating  architecture  reduces  the  number  of  required  processing  stages  to 
two.  However,  the  system  capacity  is  reduced  unless  the  size  of  each  stage  is  increased. 
Thus,  there  is  a  hardware  trade  off  between  the  pipelined  and  recirculating  architectures. 
The  former  requires  many  stages  while  the  latter  requires  two  large  stages.  While  it  is 
possible  to  increase  system  capacity  by  utilizing  larger  arrays,  available  fabrication 
technology  will  set  a  maximum  achievable  limit  on  array  size.  In  order  to  overcome  this 
limitation  on  system  capacity,  it  is  possible  to  implement  each  stage  of  the  recirculating 
architecture  with  multiple  arrays.  However,  this  increases  the  mechanical  complexity  of  the 
system.  Figure  4  plots  the  number  of  arrays  required  to  implement  a  bitonic  sort  as  a 
function  of  maximum  array  size  (and  sorter  capacity)  for  both  architectures.  For  comparison 
it  is  assumed  that  system  capacity  is  the  same  for  both  architectures.  The  shaded  portion  of 
figure  4  indicates  the  region  where  the  recirculating  architecture  requires  fewer  arrays  than 
the  pipelined  architecture.  Note  that  for  word  lengths  shorter  than  64  bits,  the  number  of 
arrays  required  for  the  recirculating  system  is  always  less  than  the  number  required  for  the 
pipeline  system.  For  word  lengths  greater  than  64  bits,  there  is  a  region  where  the  pipeline 
system requires fewer arrays. However, even for larger word lengths, the number of arrays
required  to  implement  very  large  capacity  systems  can  be  reduced  with  a  recirculating 
architecture. 
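
To give a feel for the stage counts behind this trade off, the sketch below computes the number of compare-and-exchange ranks in a standard bitonic sorting network for N words, log2(N)(log2(N)+1)/2. It assumes that the pipelined architecture uses one hardware stage per rank, whereas the recirculating architecture uses two; the function name is hypothetical and the array counts of figure 4 are not reproduced here.

```python
def bitonic_rank_count(n_words):
    """Number of compare-and-exchange ranks in a bitonic sorter of n_words."""
    k = n_words.bit_length() - 1
    if n_words < 2 or (1 << k) != n_words:
        raise ValueError("n_words must be a power of two")
    return k * (k + 1) // 2


RECIRCULATING_STAGES = 2  # as stated in the text

for n in (16, 256, 1024):
    # ASSUMPTION: one pipelined hardware stage per rank
    print(n, bitonic_rank_count(n), RECIRCULATING_STAGES)
```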

4.  System  Throughput 

As  shown  in  figure  4  the  recirculating  architecture  may  require  significantly  fewer  arrays 
than  the  pipelined  architecture.  While  this  reduction  in  hardware  components  reduces  the 
mechanical complexity of the recirculating architecture, this advantage is obtained at the
cost  of  system  throughput,  which  is  defined  as  the  number  of  sorting  operations  performed 
per  unit  time.  For  the  pipeline  architecture,  the  most  significant  bits  of  a  data  set  follow  the 
least  significant  bits  of  the  previous  data  set.  Thus,  a  new  set  of  sorted  data  is  output  every 
m  clock  cycles.  Therefore,  the  throughput  of  a  pipeline  sorter  is  not  dependent  on  the 
system  capacity.  Conversely,  the  recirculating  architecture  cannot  load  a  new  set  of  input 
data  until  the  previous  set  of  data  has  been  sorted.  The  number  of  clock  cycles  required  to 
sort  a  set  of  data  is  dependent  on  both  the  system  capacity  and  the  word  length.  Therefore 
the throughput of a recirculating sorter is dependent on system capacity. For a reasonable
capacity,  the  pipeline  system  will  have  a  larger  throughput  than  recirculating  systems.  Thus, 
a  trade  off  between  mechanical  complexity  and  system  throughput  exists. 
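
The dependence described above can be illustrated with a small model. The pipeline figure follows directly from the text (one sorted set every m clock cycles). The recirculating figure is only a hypothetical model: it assumes one pass per compare-and-exchange rank of a bitonic sorter, and `cycles_per_pass` is an invented parameter standing in for the word-length-dependent cost of each pass, which the text does not quantify.

```python
def bitonic_rank_count(n_words):
    """Compare-and-exchange ranks in a bitonic sorter of n_words (power of two)."""
    k = n_words.bit_length() - 1
    if n_words < 2 or (1 << k) != n_words:
        raise ValueError("n_words must be a power of two")
    return k * (k + 1) // 2


def pipeline_throughput(f_clk_hz, m_bits):
    """Sorted data sets per second: one set every m clock cycles."""
    return f_clk_hz / m_bits


def recirculating_throughput(f_clk_hz, n_words, cycles_per_pass):
    """HYPOTHETICAL model: a new set is loaded only after every rank has run."""
    return f_clk_hz / (bitonic_rank_count(n_words) * cycles_per_pass)


print(pipeline_throughput(1e6, 32))
print(recirculating_throughput(1e6, 16, 4))
print(recirculating_throughput(1e6, 1024, 4))
```

In this model the pipeline throughput has no capacity argument at all, while the recirculating throughput falls as the capacity grows.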

The  optimization  of  this  trade  off  is  certainly  application  dependent.  Since 
telecommunications  applications  are  throughput  sensitive,  the  added  mechanical  complexity 
required to increase throughput may be warranted. However, computing applications such
as database manipulation may not be as strongly dependent on throughput. Therefore, the
reduction of mechanical complexity may justify the corresponding reduction in throughput.
Ultimately, the acceptable degree of mechanical complexity for a system will be limited by
the  cost  and  reliability  of  the  implementation.  Thus,  regardless  of  requirements  for  high 
throughput,  a  recirculating  architecture  may  be  necessary  for  sorting  very  large  data  sets. 

5.  Conclusions 

We  have  presented  an  optoelectronic  recirculating  architecture  which  decreases  the  hardware 
required to implement a bitonic sorter. Along with a system layout, an arrangement of
bitwise  C&E  modules  was  presented  which  performs  a  bit  parallel  word  comparison.  The 
bit  parallel  data  organization  used  in  this  recirculating  system  results  in  decreased  system 
throughput.  Thus,  the  recirculating  architecture  reduces  the  mechanical  complexity  of  the 
system  by  reducing  the  number  of  stages  required  at  the  cost  of  reducing  the  system 
throughput. The optimization of this trade off is dependent on the specific application of the
sorting  system. 
Abstract. We describe the implementation of a sigmoid updating probability required for
Monte Carlo computations and simulated annealing. It relies on the simultaneous detection of
two  samples  of  the  same  speckle  field.  The  temperature  of  the  sigmoid  curve  produced  is  entirely 
controlled  by  the  mean  intensity  of  the  incident  speckle  field.  An  experimental  demonstration  is 
described  using  a  differential  pair  of  pnpn  photothyristors. 


1.  Introduction 


The  simulated  annealing  algorithm  [1]  is  one  of  the  most  widely  used  stochastic  optimisation 
techniques.  It  is  based  on  a  random  exploration  of  the  state  space.  Formally,  changing  one 
variable of the energy function E, resulting in an energy gradient $\Delta E$, will be accepted with the
sigmoid probability

$$P(\Delta E) = \frac{1}{1 + \exp(\Delta E / T)} \qquad (1)$$



where  the  parameter  T  is  called  the  temperature  and  should  decrease  slowly  over  time 
(annealing)  to  reach  global  energy  minimisation.  We  show  that,  when  combined  with  speckle 
illumination,  a  simple  processing  element  (namely  a  differential  pair  of  pnpn  photothyristors) 
can  perform  this  stochastic  operation. 
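
In software, the stochastic operation of Eq. 1 is usually realised with a pseudo-random number, as in the minimal sketch below. It shows the role that the speckle-illuminated pair is meant to take over; the function names, the `propose` and `energy` callbacks and the overflow guard are assumptions made for the example.

```python
import math
import random


def accept_probability(delta_e, temperature):
    """Sigmoid updating probability of Eq. 1."""
    x = delta_e / temperature
    if x > 700.0:    # guard: exp() would overflow, probability is ~0
        return 0.0
    if x < -700.0:   # probability is ~1
        return 1.0
    return 1.0 / (1.0 + math.exp(x))


def anneal_step(state, energy, propose, temperature, rng=random):
    """One update: propose a change and accept it with probability P(dE)."""
    candidate = propose(state)
    delta_e = energy(candidate) - energy(state)
    if rng.random() < accept_probability(delta_e, temperature):
        return candidate
    return state


print(accept_probability(0.0, 1.0))   # a zero gradient is accepted half the time
```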

Many  different  optoelectronic  devices  could  conceivably  be  used  in  the  elaboration  of 
this  new  technique:  SEEDs,  optically  activated  VCSEL,  etc.  We  use  pnpn  photothyristors  for 
our  demonstration.  These  devices  provide  amplification  and  allow  the  implementation  of 
cascading  operation  particularly  well-suited  for  iterative  computation.  Combined  in  a 
differential  pair,  two  pnpn  photothyristors  act  as  a  simple  maximum  detector  [2].  This  mode  of 
operation  is  obtained  by  connecting  two  photothyristors  in  parallel.  The  voltage  source  applied 
to  the  pair  is  a  three-step  voltage  sequence  corresponding  to  reset,  light  detection  and  light 
emission.  These  devices  can  be  described  simply  as  comparators:  the  photothyristor  which  has 
detected  the  highest  intensity  switches  on  and  emits  light. 


2.  Differential  detection  of  random  speckle 

Laser speckle, as a random number generator, offers great flexibility and easy
implementation [3]. By differential detection of two independent speckle samples from the
same speckle field, a random signal of zero mean is created and thus an easy implementation of
a symmetric probability function is allowed.

Fig. 1 Optical inputs applied to the differential pair.

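The speckle intensity $I$ incident on a detector can be statistically described with a gamma
probability function where the only parameters are the mean intensity of the speckle field
$\langle I \rangle$ and the number of degrees of freedom $M$ in the detected speckle [4]:

$$p(I) = \frac{1}{\Gamma(M)} \left(\frac{M}{\langle I \rangle}\right)^{M} I^{M-1} \exp\left(-\frac{M I}{\langle I \rangle}\right), \quad \text{with } I > 0 \qquad (2)$$
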


The parameter M is also linked to the variance of the speckle and can be shown to be the
square of the optical signal to noise ratio: $M = (\langle I \rangle / \sigma)^2$, where $\sigma$ is the standard
deviation of the speckle intensity. To take an intuitive approach in this
paper,  we  consider  the  gamma  probability  functions  to  be  approximately  equal  to  gaussian 
functions.  This  approximation  is  valid  when  the  number  of  degrees  of  freedom  M  is  large  (in 
practice  greater  than  10). 
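
The link between M, the mean intensity and the variance can be checked numerically. The sketch below draws speckle intensities from a gamma distribution with shape M and scale equal to the mean intensity divided by M, then recovers M as the squared signal to noise ratio. The sample size, seed and parameter values are arbitrary choices for illustration.

```python
import random
import statistics


def sample_speckle(mean_intensity, m, rng):
    """One detected intensity: gamma with shape M and scale <I>/M."""
    return rng.gammavariate(m, mean_intensity / m)


rng = random.Random(0)
mean_i, m = 1.0, 16            # illustrative values, not from the experiment
samples = [sample_speckle(mean_i, m, rng) for _ in range(20000)]

est_mean = statistics.mean(samples)
est_var = statistics.pvariance(samples)
est_m = est_mean ** 2 / est_var    # squared signal to noise ratio

print(est_mean, est_m)
```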

In  order  to  demonstrate  the  effect  of  differential  detection,  we  suppose  a  differential  pair 
of  pnpn  photothyristors  is  illuminated  with  a  spatially  homogeneous  random  speckle  pattern. 
The  photothyristor  detecting  the  most  intense  speckle  will  switch  on.  The  switch-on  probability 
can be calculated by associating it to the speckle probability function. This is simply the
probability that the first incident speckle intensity $I_1$ is larger than the second one $I_2$:
$p(\mathrm{ON}) = p(I_1 > I_2)$. We also include the energy gradient $\Delta E$ of Eq. 1 as an illumination on the
second photothyristor of the pair with an additional laser beam having an intensity $\Delta E$ (see
Fig. 1). The switch-on probability for the first photothyristor is the probability that the speckle
intensity received on the first photothyristor is greater than that on the second photothyristor,
which is the combined intensity of the speckle and of the laser.

$$p(\mathrm{ON}) = P(\Delta E) = p(I_1 > I_2 + \Delta E) = p(I_2 - I_1 < -\Delta E) \qquad (3)$$


The random variable $I_2 - I_1$ results from the difference between two known gaussian random
variables having the same statistical parameters. The probability distribution of this new
random variable is a gaussian probability distribution of zero mean and double variance.
Therefore, the probability that the photothyristor to which the additional beam $\Delta E$ is applied
will switch on is


This is simply an Erf function. It is equal to the sigmoid of Eq. 1, with a high degree of
accuracy, provided that

$$T = \sigma \sqrt{\pi/4} = \langle I \rangle \sqrt{\pi/(4M)}. \qquad (5)$$


The temperature is found by matching the slopes of the sigmoid and of the Erf function for
$\Delta E = 0$. Consequently, a differential pair of photothyristors illuminated with a speckle pattern
should implement the stochastic updating operation of Eq. 1. Although we limited ourselves to
gaussian speckle probability functions, we have analytically shown that the gamma statistics of
the speckle allow accurate quasi-sigmoid operation with arbitrarily small M values.

Fig. 2 Switch-on probabilities for one pnpn of the differential pair as a function of the
incident laser energy, for many values of the mean speckle field.
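
Eq. 4 is not legible in this copy. As a hedged reconstruction under the gaussian approximation used above, the slope matching at zero gradient can be written out as follows. The symbol $\sigma$ (standard deviation of one detected intensity, so that the difference has variance $2\sigma^2$) and the intermediate steps are assumptions of this sketch, not the authors' text.

```latex
\begin{align*}
p(\mathrm{ON}) &= p(I_2 - I_1 < -\Delta E)
  = \frac{1}{2}\left[1 - \operatorname{erf}\!\left(\frac{\Delta E}{2\sigma}\right)\right], \\
\left.\frac{d\,p(\mathrm{ON})}{d\,\Delta E}\right|_{\Delta E = 0} &= -\frac{1}{2\sigma\sqrt{\pi}},
\qquad
\left.\frac{d}{d\,\Delta E}\,\frac{1}{1 + \exp(\Delta E / T)}\right|_{\Delta E = 0} = -\frac{1}{4T}, \\
T &= \frac{\sigma\sqrt{\pi}}{2} = \sigma\sqrt{\pi/4}
   = \langle I \rangle \sqrt{\frac{\pi}{4M}},
\qquad \text{using } \sigma = \frac{\langle I \rangle}{\sqrt{M}}.
\end{align*}
```

Equating the two slopes gives a temperature that is proportional to the mean speckle intensity for a fixed M.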


3.  Experimental  results 

3.1.  Set-up 

We  tested  the  differential  operation  described  above  with  a  differential  pair  of  pnpn 
photothyristors. The two photothyristors were identical squares of 50 x 50 µm² separated by
15 µm. Two laser illuminations were used. First, a speckle pattern created by the modal noise
of  a  step  index  fibre  was  used  to  randomly  illuminate  the  pnpn  integrated  circuits.  The  spatial 
independence  of  the  speckle  samples  detected  by  the  two  photothyristors  is  satisfied  by  an 
appropriate  magnification  of  the  output  face  of  the  fibre.  Also,  the  successive  generation  of 
time-independent  speckles  was  ensured  by  inserting  a  rotating  diffuser  between  the  laser  source 
and  the  fibre  input,  thereby  randomising  the  phase  of  the  input  laser  beam.  As  a  second  input,  a 
laser  diode  was  imaged  onto  one  of  the  photothyristors  of  the  pair,  thus  acting  as  the  energy 
gradient  on  one  of  the  photothyristors. 

Light  emission  of  the  photothyristor  not  illuminated  by  the  laser  diode  was  recorded  with 
a photodiode. We synchronised the detection with the voltage supply of the pair. In this
way  we  detected  the  optical  output  of  one  of  the  pnpns  and  were  able  to  calculate  its 
probability  to  switch  on  from  the  number  of  times  it  emitted  over  the  total  number  of 
observations. 
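
The counting procedure can be mimicked numerically. The sketch below is a simulation under stated assumptions, not a model of the actual devices: both photothyristors are treated as ideal comparators, the two speckle samples are independent gamma variates, and the values of M, the mean intensity and the added intensity are arbitrary. The estimated switch-on frequency is compared with a sigmoid whose temperature is the mean intensity multiplied by the square root of π/(4M).

```python
import math
import random


def switch_on_frequency(delta_e, mean_intensity, m, trials, rng):
    """Fraction of trials in which the first detector wins (I1 > I2 + dE)."""
    scale = mean_intensity / m
    hits = 0
    for _ in range(trials):
        i1 = rng.gammavariate(m, scale)   # speckle on the recorded device
        i2 = rng.gammavariate(m, scale)   # speckle on the device with the laser
        if i1 > i2 + delta_e:
            hits += 1
    return hits / trials


def sigmoid(delta_e, temperature):
    return 1.0 / (1.0 + math.exp(delta_e / temperature))


rng = random.Random(1)
mean_i, m = 1.0, 16                                   # illustrative values
temperature = mean_i * math.sqrt(math.pi / (4 * m))
for delta_e in (-0.4, 0.0, 0.4):
    print(delta_e,
          switch_on_frequency(delta_e, mean_i, m, 5000, rng),
          sigmoid(delta_e, temperature))
```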

3.2.  Results 

The experimental results are shown in Figs. 2 and 3. The data points of the quasi-sigmoid curve
are computed switch-on probabilities, each corresponding to the average over 5000 intensity
measurements. Four probability curves, corresponding to four different temperatures, are
shown. To demonstrate the obvious sigmoid characteristic of the response curves, the sigmoid
updating probabilities of Eq. 1 were fitted to the experimental data points. In accordance with
Eq. 5, these curves were obtained with four different values of the mean speckle intensity $\langle I \rangle$,
respectively 0.1 mW, 0.4 mW, 0.6 mW and 1.4 mW. Since the temperature determines the
amount of randomness in the updating operation, large values of T correspond to large values
of $\langle I \rangle$ and the results correspond perfectly with the theoretical expectations. Furthermore, the
temperature characterising each of the sigmoid curves is directly proportional, as expected, to
the corresponding mean speckle energy. This is shown in Fig. 3 where we plotted the
experimental temperature as defined in Eq. 5.

Fig. 3 Temperature of the sigmoid as a function of the mean speckle intensity.

We  have  noted  that  all  the  sigmoid  curves  are  shifted  to  the  right.  This  is  the  effect  of 
using  real  elements  instead  of  perfect  comparators.  The  shift  corresponds  to  the  smallest 
energy  required  for  a  correct  switch-on  operation.  This  behaviour  is  caused  by  the  intrinsic 
asymmetry  of  the  pnpn  photothyristor  pair.  It  corresponds  to  an  energy  of  about  2  pJ.  We  can 
minimise the effect of this undesirable phenomenon by using large sigmoid temperatures, thus
working  with  energy  gradients  one  order  of  magnitude  larger  than  the  shift  itself. 

An integration time of 60 µs was used. It was limited by the available laser power and
not  by  the  pnpn  photothyristor  technology.  Operating  frequencies  in  the  range  of  100  kHz  are 
achievable. 


4.  Conclusion 

We  have  presented  a  simple  way  to  implement  a  stochastic  processing  element  with  sigmoid 
updating  probabilities.  We  use  speckle  to  generate  spatially  and  temporally  independent 
random  numbers  that  we  combine  with  pairs  of  pnpn  photothyristors  for  a  thresholding 
comparison.  The  result  shows  obvious  sigmoid  behaviour  for  the  switch-on  probabilities.  Thus, 
it  suggests  a  possible  use  in  a  parallel  optoelectronic  simulated  annealing  device. 
Abstract. Some of the achievements of optical computing in the National 863 High
Technology  Program  in  China  are  summarised. 


1.  Introduction 

The  National  863  High  Technology  Program,  which  was  proposed  and  drawn  up  by  scientists, 
has  been  implemented  in  China  since  March  1986.  Under  the  program  there  are  several 
groups  in  different  scientific  fields.  There  is  Optical  Computing  in  the  Group  of 
Optoelectronic  Devices  and  Integration,  which  incorporates  more  than  twenty  universities  and 
institutes  that  have  made  achievements  in  optical  computing.  The  several  areas  of  optical 
computing  are  being  implemented  as  follows. 

2.  Devices 

2.1 Investigation of vertical cavity surface emitting lasers (VCSELs) (Beijing Institute of
Semiconductors). The optimum structure of the VCSEL has been designed and fabricated by
MBE. By comparing the theoretical and experimental results of reflection spectra, x-ray
double-crystal diffraction and photoluminescence, the match between the centre
wavelength of the DBR high reflection band, the resonant frequency of the F-P cavity and the
stimulated wavelength of the quantum well is realised. By means of proton bombardment
the gain waveguide type of the device has been developed. The lowest threshold current
under pulsed conditions at room temperature is less than 10 mA; it works in a single transverse
and single longitudinal mode.

2.2 GaAs/GaAlAs multi quantum well reflectance modulator and self electro-optic effect
devices (SEEDs) (Beijing Institute of Semiconductors). The analysis emphasises the
combined behaviour of the Quantum Confined Stark Effect (QCSE), DBR and the asymmetric
F-P cavity (ASFP) effect. Experimental results include: (i) A fabricated modulator array, with
contrast ratio 7 dB ~ 10 dB, and its application to free-space optical switching and
interconnection, (ii) Demonstration of bistability of an S-SEED array with optical switch
energy less than 10 fJ/(µm)².

2.3 8 x 8, 0.88 µm LED matrix module (JiLin Institute of Physics). An 8 x 8, 0.88 µm LED
matrix module of specific design with a black ceramic material base has been fabricated by a
hybrid integration technique. This device can be used for optical interconnection or 2D optical
computing.

2.4 An optically addressed LCLV device and an electrically addressed CRT-LCLV (Xian
Optical and Fine Machine Institute). The technical specifications are as follows:

                          LCLV            CRT-LCLV
Size of image plane       φ 50 mm         φ 50 mm
Resolution (0.5 MTF)      50 lp/mm        35 lp/mm
Contrast                  100:1           50:1
Response time             30 ms           30 ms
Sensitivity               3.7 erg/cm²

2.5 A ferroelectric liquid crystal light valve (FLCLV) (Shanghai Institute of Optical
Instruments). An FLCLV has been developed. The technical performance is as follows:

Size of image plane       > φ 20 mm
Resolution (0.5 MTF)      > 35 lp/mm (centre)
                          > 25 lp/mm (off-centre)
Response time             = 1~5 ms
Lifetime                  > 1 year


2.6 CdS, ZnS F-P device array for thresholding, feedback and gain (Changchun Institute of
Optics and Fine Machines). CdS and ZnS F-P device arrays have been developed. Thresholding is
implemented with the intensity sum of the signal and holding beams. When their intensity sum
exceeds the threshold of the device, the holding beam passes through the device array as a
feedback beam. The device array work-area is 10 x 10 mm², with 125 x 125 elements; the element
size is 5 µm x 5 µm; the space between two elements is 3 µm. The ZnS device array is suitable
for a working wavelength of 514.5 nm; the operation energy density is 200 nJ/µm². The CdS
device array is suitable for a working wavelength of 632.8 nm; the operation energy density is
20 nJ/µm².

2.7 Nonresonant low power real-time storage (Jilin University). Two materials for real-time
storage  are  developed:  one  is  a  push-pull  azobenzene  compound  doped  polymethyl 
methacrylate  (PMMA)  thin  film.  It  is  shown  that  this  polymer  has  appreciable  third-order 
optical  nonlinearities  with  a  nonresonant  of  about  4  x  10*^  esu  for  a  He-Ne  laser  at 
632.8 nm. The other is biphoton real-time storage with methyl orange (MO) and ethyl orange
(EO) sensitised polyvinyl alcohol (PVA) film. Experimental results: Ar⁺ laser (514.5 nm)
pump power: 0.6 W cm⁻² and He-Ne laser lowest write power 0.2 W cm⁻².


3.  Architecture  and  system 

3.1 An optoelectronic hybrid parallel multiprocessor system (Tianjin University). An
optoelectronic hybrid parallel multiprocessor system is established. It consists of two stages
of processor arrays (2DPA). Its architecture is a 3D pipeline. The interconnection in the
2DPAs is electronic, and it can be easily reconfigured into various topologies. The optical
interconnection provides the communication between two PEs located in the two 2DPAs,
respectively, in the third dimension.

3.2 Hybrid morphological processor (Tsinghua University). An optoelectronic real-time
morphological processor is established. A Dammann grating produces multiple images;
LCTV2, situated at the spectrum plane of the Dammann grating, is used to display the structure
element. When the light illuminates LCTV1, the convolution between the input image and the
structure element is performed. A CCD camera collects the convolution result into a
computer for thresholding. Thus morphological operations, such as dilation and erosion, can be
obtained. Union and complement are also realised.
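
The convolve-then-threshold principle can be sketched digitally as below. This is an illustration under assumptions, not the optical system: images are small binary lists, the structure element is assumed symmetric (so that correlation and convolution coincide), pixels outside the image are treated as 0, and all function names are hypothetical.

```python
def overlap_count(image, selem):
    """For each pixel, count structure-element pixels that land on a 1."""
    rows, cols = len(image), len(image[0])
    s_rows, s_cols = len(selem), len(selem[0])
    c_r, c_c = s_rows // 2, s_cols // 2
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0
            for i in range(s_rows):
                for j in range(s_cols):
                    rr, cc = r + i - c_r, c + j - c_c
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += image[rr][cc] * selem[i][j]
            out[r][c] = total
    return out


def threshold(counts, level):
    return [[1 if v >= level else 0 for v in row] for row in counts]


def dilate(image, selem):
    # any overlap at all switches the output pixel on
    return threshold(overlap_count(image, selem), 1)


def erode(image, selem):
    # every structure-element pixel must land on a 1
    size = sum(sum(row) for row in selem)
    return threshold(overlap_count(image, selem), size)


square = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 1
grown = dilate(dot, square)      # the single pixel grows into a 3 x 3 block
print(grown)
print(erode(grown, square))      # eroding the block recovers the single pixel
```

Only the threshold level changes between the two operations, which is why a single convolution stage followed by computer thresholding can serve both.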

3.3  Optical  cellular  logic  image  processor  (Shanghai  Institute  of  Optics  and  Fine 
Mechanics).  A  one-operation  image  algebra  was  developed,  in  which  the  major  operator 
consists  of  a  parallel  binary  logic  operation  followed  by  a  binary  morphological  dilation 
operation.  Combined  with  threshold-decomposition  and  sum-superposition,  various 
morphological  functions  for  binary  and  gray-tone  image  processing  can  be  realised  by  the 
simple  iterative  use  of  the  operator.  A  four-layer  architecture  of  cellular  logic  array  was 
suggested  to  execute  the  algorithm  for  both  binary  and  gray-tone  image  processing.  The  CPU 
is  the  cascade  of  a  parallel  binary  logic  processor  and  a  binary  dilation  processor.  Three 
systems  were  constructed.  In  the  first  system,  the  logic  processor  is  a  multi-imaging  system 
with  dual-rail  spatial  coding  and  the  dilation  processor  is  simply  a  defocussed  unit  with  coded 
aperture.  In  the  second  system,  the  logic  processor  is  a  prism-optics  assembled  module  and 
the  dilation  processing  is  realised  by  a  Dammann  grating.  As  the  third  alternative,  an  optical 
module  scheme  using  3-D  stacked  integration  of  polarisation  elements  was  proposed.  This 
packaged  module  is  compact  and  effective.  In  all  the  configurations,  the  feedback  is  carried 
out by a PC.

4.  Neural  network  processors 

4.1 A fully-bipolar neural network with 32 x 32 neurons. A coaxial architecture and 2D
optically-bipolar  neural  network  with  32  x  32  neurons  is  developed.  The  IWM 
(Interconnection  Weight  Matrix)  is  implemented  with  a  transparent  SLM  or  a  slide  illuminated 
by a parallel beam from a fluorescent lamp with spectra $\lambda_1$, $\lambda_2$ and $\lambda_3$. The IWM is placed in
the front focal plane of a lenslet array. The light passing through each submatrix reaches the
liquid crystal switch array (LCSA) through the relevant lenslet and the imaging lens L1. The
LCSA is the conjugate plane of the IWM. By the imaging properties of geometrical optics, N x N
images of each submatrix overlay one another on the LCSA. Because the plane of the
detector array is conjugate with the back focal plane of the lenslet array, N x N sharp bright
spots which represent the weighted summations will be obtained on the detector array and the
positions of these spots will shift with the direction of the parallel beam illuminating the IWM.
Therefore, on the surface of each detector array element, we can obtain three spots
corresponding to $\lambda_1$, $\lambda_2$ and $\lambda_3$ due to dispersion of a wedge. By proper colour encoding for
the polarities of weights and neuron states, we can get $I_{\lambda_1}$, $I_{\lambda_2}$, $I_{\lambda_3}$ corresponding to positive
and negative weighted summations. The operations of thresholding and feedback are fulfilled
by  a  computer  for  the  next  iteration. 
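
The split into positive and negative weighted summations, followed by electronic thresholding, can be sketched as follows. This is a minimal one-dimensional illustration, not the 32 x 32 optical system: the two accumulators stand in for separately detected non-negative intensities, and the function name, the sign convention and the small example network are assumptions.

```python
def bipolar_update(weights, state):
    """One synchronous update of bipolar (+1/-1) neurons.

    Positive and negative contributions are accumulated in separate
    non-negative channels, as an intensity detector would see them,
    and subtracted only at the thresholding step.
    """
    new_state = []
    for row in weights:
        positive = 0.0
        negative = 0.0
        for w, s in zip(row, state):
            product = w * s
            if product > 0:
                positive += product
            else:
                negative += -product
        new_state.append(1 if positive >= negative else -1)
    return new_state


# Hopfield-style example: store one pattern, then recall it from a noisy copy.
pattern = [1, -1, 1, -1]
weights = [[0 if i == j else pattern[i] * pattern[j] for j in range(4)]
           for i in range(4)]
print(bipolar_update(weights, [1, -1, 1, 1]))
```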

4.2  Optoelectronic  hybrid  real-time  four-channel  associative  memory  system  (Tianjin 
University  and  Harbin  University).  This  system  includes  three  parts:  input  system,  processing 
unit and output system. The processing steps are as follows:

1. Pre-processing of an image: A large number of reference images are stored on a
computer disk, after being processed by software under the control of a computer.

2. Input of images: As this is a four-channel system, the monitor screen is divided into
2 x 2 units by software. Four reference images and an object to be recognised are displayed on
the proper units of the monitor.

3. Associative recognition: Firstly, four different reference images are recorded in
four photorefractive crystals by means of a Fourier transform (FT) hologram. Then
four identical displayed images of one unknown object are input into the processing unit, where
they are correlated with the reference images simultaneously. The results are the
signal beams that control the on-off states of liquid crystal switches. If one reference
image matches the unknown one (auto-correlation), there will be a maximum output. As a result,
the corresponding switch is on, and the holding beam passes through it and addresses the
photorefractive crystal in this channel.
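
The selection logic of step 3 can be sketched digitally as below. It is a simplified stand-in for the holographic correlation: the correlation peak is approximated by a zero-shift inner product normalised by the reference energy, and the function names and the tiny test images are hypothetical.

```python
import math


def correlation_peak(unknown, reference):
    """Zero-shift correlation, normalised by the reference energy."""
    dot = sum(u * r for row_u, row_r in zip(unknown, reference)
              for u, r in zip(row_u, row_r))
    energy = sum(r * r for row in reference for r in row)
    return dot / math.sqrt(energy)


def recognise(unknown, references):
    """Return the winning channel and the on-off state of each switch."""
    peaks = [correlation_peak(unknown, ref) for ref in references]
    winner = max(range(len(peaks)), key=peaks.__getitem__)
    switches = [channel == winner for channel in range(len(peaks))]
    return winner, switches


references = [
    [[1, 0], [0, 0]],
    [[1, 1], [0, 0]],
    [[1, 0], [1, 1]],
    [[1, 1], [1, 1]],
]
print(recognise([[1, 0], [1, 1]], references))
```

Only the channel with the maximum output has its switch on, which mirrors the holding beam being passed in a single channel.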

4.3  Optoelectronic  hybrid  system  for  image  information  extraction  and  associative 
recognition  (Beijing  Institute  of  Physics).  This  system  extracts  the  image  information  with 
cross-scanning;  a  vector  is  set  up  with  32  bits  of  information,  from  an  input  image.  The 
associative  recognition  result  is  obtained  by  the  inner-product  of  the  vector  and  a  stored 
matrix. 

4.4 Optoelectronic implementation of a compression-attraction-sphere associative memory
using an optical pattern-encoding method (Tsinghua University). A
compression-attraction-sphere associative memory model is proposed. An optoelectronic
implementation of its inner-product architecture using an optical spatial pattern-encoding
method is demonstrated by computer simulation and preliminary experiments.

5.  Some  applications  in  pattern  recognition 

5.1  Four  channel  real-time  hybrid  joint  transform  correlator  (JTC)  (Tianjin  University  and 
Changchun  Institute  of  Optics  and  Fine  Mechanics).  A  four-channel  real-time  JTC  is 
proposed.  Two  optical  wave-front-division  multiplexers  (OWFDM)  divide  the  aperture  of  the 
system  into  four  channels.  In  this  system,  the  readout  light  energy  as  well  as  the  physical  area 
of  the  liquid  crystal  light  valve  (LCLV)  devices  can  be  fully  utilised.  When  this  system  is  used 
with  only  one  OWFDM  in  front  of  the  system,  it  can  work  at  a  speed  four  times  as  fast  as  a 
conventional  one-channel  system. 

5.2 3-D spatial object recognition using analogue optical computing (Huadong University of
Science and Technology). The spatial object is recognised by using a correlation operation in
a coherent optical filtering system. The correlation result is received by an optoelectronic
detector array which is placed at the output plane; the recognition result is then given by a
computer, which makes an overall judgement. A small filter bank is made up of a series of filters.
The key step in this method is the design and fabrication of the filter bank.

5.3  Hit-or-miss  optical  pattern  recognition  system  (Tsinghua  University).  This  system  is 
based on the morphological hit-or-miss operation for multi-target recognition. It is shift
invariant, size invariant and rotation invariant.
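
For reference, the basic hit-or-miss operation can be sketched as follows. This is the textbook binary form only, with hypothetical names and pixels outside the image treated as background; it is shift invariant by construction but does not attempt the size and rotation invariance claimed for the optical system.

```python
def hit_or_miss(image, hit, miss):
    """Mark pixels where 'hit' fits the foreground and 'miss' fits the background."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(hit), len(hit[0])
    c_r, c_c = k_rows // 2, k_cols // 2

    def pixel(r, c):
        # outside the image counts as background
        return image[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            match = True
            for i in range(k_rows):
                for j in range(k_cols):
                    value = pixel(r + i - c_r, c + j - c_c)
                    if hit[i][j] and value != 1:
                        match = False
                    if miss[i][j] and value != 0:
                        match = False
            out[r][c] = 1 if match else 0
    return out


# Template for an isolated single pixel: centre on, all eight neighbours off.
hit = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
miss = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
scene = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(hit_or_miss(scene, hit, miss))   # only the isolated pixel is marked
```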

5.4 An optoelectronic implementation of a neural network for recognising handwritten
characters  (Tsinghua  University).  This  system  consists  of  two  sub-networks.  The  input  layer 
and  hidden  layer  comprise  the  first  sub-network;  the  second  sub-network  implements  Boolean 
logic  (AND-OR)  operations.  It  combines  the  classification  discrimination  curved  surface  with 
a  super-plane  formed  by  the  first  sub-network. 

6.  Conclusions 

(1) Under a unified organisation, all the studies are proceeding cooperatively.

(2)  There  are  some  differences  between  the  National  Nature  Academic  Fund,  which  mainly 
supports  the  studies  of  areas  such  as  new  ideas,  new  architectures  and  new  kinds  of  devices, 
etc., and the National 863 Program, which lays particular emphasis on creative research
in practical applications.

(3) The first stage of this program will end by 2000. By then, some optical computing
systems should be available.
An optoelectronic hybrid parallel multiprocessor system *

Abstract: In this paper the architecture of a 3D O-E hybrid multiprocessor is
proposed. An O-E hybrid interconnection network is used for this system.

1.  The  architecture  of  the  optoelectronic  hybrid  parallel  multiprocessor  system 

An optoelectronic hybrid parallel multiprocessor system is being built, in which there
are two stages of 2D processor arrays (2DPA); optical interconnections are
employed between these arrays, as shown in Fig. 1. Its architecture is a 3D pipeline.


Fig. 1 The scheme of the O-E hybrid parallel multiprocessor system.

(2DPA)-2D O-E hybrid Processor Array, (PA)-Processing element Array, (SN)-Switch Network, (ERA)-Emitter and
Receiver Array, (FOIN)-Fiber Optic Interconnection Network, (FC)-Fiber-optic Channel.

Every two-dimensional O-E hybrid processor array comprises N x N processing
elements (PEs). The interconnection in the 2DPA is electronic. It can be easily
reconfigured into various topologies.

Ultimately, two such O-E parallel multiprocessor systems are connected by
multiple bidirectional optical channels. Fig. 2 shows the diagram of this O-E parallel
processor array system.
Fig. 2 The scheme of the physical fiber optical interconnection
network (FOIN), FC-Fiber Channel

2.  O-E  interconnection  network 

The optical interconnection provides the communication between two PEs located in the two
2DPAs, respectively, in the third dimension. In our system, the fiber optical
interconnection network (FOIN) shown in Fig. 1 has 64 bidirectional fiber optical
channels (FC), and consists of an emitter array with 64 LED modules and a receiver
array with 64 PIN modules. The global transmission rate of the FOIN is 1.28 Gbit/s.
The  test  results  show  that  the  probability  of  error  of  data  transmission  is  less  than  10-‘L 

In the practical system, the switch network (SN) shown in Fig. 1 is the crossbar with
64 x 64 (I/O) ports in the FOIN. The scheme of the interconnection network is shown in Fig. 2.
It can be seen that there are four crossbars (i.e. SNs in Fig. 1), which play important
roles in both the optical interconnection and the topological reconfiguration of the 2DPA.
There are 6 processing elements in each crossbar, and the programmes are stored in these
processing elements. Once communication between the two 2DPAs is needed, the
programmes make the four crossbars connect through two 64-FC modules; for instance,
crossbar11 connects with crossbar21 by 64 FCs, while crossbar12 connects with crossbar22
using another 64 FCs simultaneously. The manner may be described by the following
equation:


$$(\mathrm{2DPA})_n (\mathrm{PE}_{ij}) \leftrightarrow (\mathrm{2DPA})_m (\mathrm{PE}_{kl})$$
where n, m = 1, 2; i, k = 1 ~ 8; and j, l = 1 ~ 8.
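
As a rough illustration of the addressing implied here, the sketch below maps an 8 x 8 PE index pair to one of 64 fiber channels and divides the global rate evenly among the channels. The row-major numbering, the even split of the 1.28 Gbit/s global rate and all names are assumptions made for the example; the text does not state how channels are assigned.

```python
N = 8                              # PEs per side of one 2DPA
CHANNELS = N * N                   # 64 fiber channels
GLOBAL_RATE_BIT_PER_S = 1.28e9     # global transmission rate of the FOIN


def channel_index(i, j, n=N):
    """Map PE (i, j), both 1-based, to a channel number 0..n*n-1.

    ASSUMPTION: row-major numbering, chosen only for illustration.
    """
    if not (1 <= i <= n and 1 <= j <= n):
        raise ValueError("PE indices must lie in 1..n")
    return (i - 1) * n + (j - 1)


# ASSUMPTION: the global rate is shared equally by all channels.
per_channel_rate = GLOBAL_RATE_BIT_PER_S / CHANNELS

print(channel_index(1, 1), channel_index(8, 8), per_channel_rate)
```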

3.  Implementation  of  topological  reconfiguration 
The realization of topological reconfiguration of each 2DPA relies on two crossbars
with 64 x 64 (I/O) ports. The diagram of the 2DPA is shown in Fig. 3. The crossbar with
64 x 64 (I/O) ports in our system is built from six crossbars with 32 x 32 (I/O)
ports [1]. This combined crossbar is easily programmable to connect one input with any
one output channel.


Fig.3 The scheme of interconnection in 2DPA.


Fig.4 (a) Photograph of a 2DPA built up with the FOIN.
(b) Photograph of the waveform of data transmission in the optical fiber channel (FC).

It can be seen from Fig.3 that a processing element in the 2DPA has four pairs of high
speed links. Two pairs connect directly to two other elements; the other two pairs are
connected to crossbar11 and crossbar12 respectively to form complete connectivity. By
programming these two crossbars, the interconnection among the 64 PEs may be
reconfigured. Therefore, various topologies such as mesh, tree and hypercube may be
configured, depending on the computing algorithm.
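
The idea of reprogramming a crossbar to change the topology can be illustrated with a
minimal sketch. The `Crossbar` class and the two `program_*` helpers below are
hypothetical names introduced only for illustration; they model a crossbar as a one-to-one
map from input ports to output ports and do not describe the actual control programmes of
the system.

```python
class Crossbar:
    """Hypothetical model: each input port is routed to at most one output port."""

    def __init__(self, ports=64):
        self.ports = ports
        self.route = {}  # input port -> output port

    def connect(self, src, dst):
        if not (0 <= src < self.ports and 0 <= dst < self.ports):
            raise ValueError("port out of range")
        if src in self.route:
            raise ValueError("input port already routed")
        if dst in self.route.values():
            raise ValueError("output port already in use")
        self.route[src] = dst

    def clear(self):
        self.route.clear()


def program_ring(xbar):
    """Route PE i to PE (i + 1) mod ports, giving a ring."""
    xbar.clear()
    for i in range(xbar.ports):
        xbar.connect(i, (i + 1) % xbar.ports)


def program_hypercube_dim(xbar, dim):
    """Route PE i to the PE whose index differs in bit `dim` (one hypercube dimension)."""
    xbar.clear()
    for i in range(xbar.ports):
        xbar.connect(i, i ^ (1 << dim))


xbar = Crossbar(64)
program_ring(xbar)            # ring topology
program_hypercube_dim(xbar, 3)  # same hardware, now one hypercube dimension
```

Switching between topologies only rewrites the routing table; the physical links stay
unchanged, which is the property the text relies on.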

The hardware of an assembled 2DPA, which includes the fiber optical interconnection
network (FOIN), is shown in Fig.4(a). Various tests show that this system performs very
well. Fig.4(b) is a photograph of the data transmission signal in an FC.

4.  Conclusions 

1) An O-E hybrid multiprocessor system has been built.

2) An O-E fiber optic interconnection network is used successfully in this system.

3) The system is used for parallel pattern recognition, and the results are obtained
much more quickly.

Abstract. A two-dimensional array of microlasers controls a mating array of optical modulators
in a model of a reconfigurable optical architecture. Free-space connections are enabled through the
configuration of the microlasers. A problem in scalability is addressed in which arbitrary microlaser
patterns are created by using a sequence of subpatterns that require only row and column control
lines, rather than using an individual control line for each microlaser. Although pattern composition
with this method may require several cycles, the principle of functional locality ensures that
reconfiguration takes place infrequently. The result is a control method that scales gracefully as the
size of the architecture increases. An example of a mapping of an arbitrary interconnection problem
to the reconfigurable architecture is presented.


1.  An  Optically  Reconfigurable  Architecture 

Traditional computer architectures are static in the sense that the connectivity among the
internal components does not change, except when the hardware is physically rewired or circuit
boards  are  changed.  Despite  the  static  nature  of  traditional  computer  architectures,  widely  varying 
behavior  can  be  achieved  as  a  result  of  modifying  the  software  that  the  architectures  execute. 
The  underlying  connectivity  among  the  components,  however,  remains  unchanged. 

The instruction set in a static architecture is fixed, so performance improvements are
typically made before the architecture is finalized by optimizing the execution of the most
frequently occurring instructions [1]. For a given program, however, only 20% of the instruction
set  is  used,  and  an  even  smaller  percentage  (about  10%)  accounts  for  the  bulk  of  instruction 
execution  (about  85%)  over  a  period  of  time  [2].  This  property  of  functional  locality  can  be 
exploited  to  improve  performance  if  the  architecture  is  optimized  for  the  instructions  that  are 
currently  needed  (the  working  set)  rather  than  using  a  static  architecture  that  is  optimized  for 
the  average  mix  of  instructions.  In  order  to  optimize  the  architecture  for  the  execution  of  the 
working set, a mechanism for reconfiguring the connectivity among the architecture
components is needed that does not introduce significant latency in the clock cycle.

Although  reconfiguration  can  be  implemented  with  a  variety  of  electronic  approaches,  a 
significant latency is introduced when components are interconnected with an electronic
multistage interconnection network (MIN). Further, a significant amount of time is required to
reconfigure communication channels. We address the latency problem by using a shallow
free-space optical implementation, and we address the reconfiguration problem by narrowing the
control  stream. 

One way that the latency issue can be addressed is through the use of beam-steering, which
is analogous to physically rewiring an electronic processor. An alternative approach that we
explore here is to provide a dense array of free-space connections, and selectively enable the
connections that are needed. The underlying architecture for this approach is shown in Figure
1. A two-dimensional array of processing elements (PEs) communicates through optical
input/output (I/O) ports. Optical signals travel in free-space through a static optical interconnect
from the PE array to an array of optical modulators (S-SEEDs [3] for this case) and continue
back to the PE array. An array of configuring microlasers sets the states of the modulators in
parallel to be reflective or absorptive. A second microlaser array provides optical inputs to the
PEs.

Figure 1: Model of reconfigurable optical architecture.

A similar system (without the PEs) was recently demonstrated in the U.S. Air Force's
Photonics Center at Rome Laboratory / Griffiss AFB [4] in which vertical cavity
surface-emitting lasers (VCSELs) [5] configure the S-SEEDs and also serve as inputs. In the
Photonics Center system, modulated beams (the signal beams that are reflected from the S-SEED
array back to the PE array) are generated by discrete edge emitting lasers and binary phase grating
array-generators. This method of generating an array of beams can be replaced with a microlaser
array, leading to the model shown in Figure 1.

A  problem  in  scalability  for  this  style  of  processor  is  that  the  configuring  microlaser  array 
needs  to  be  controlled  in  such  a  way  that  only  a  narrow  control  stream  is  needed  between  the 
configuring  microlaser  controller  and  the  microlaser  array.  Section  3  addresses  the  control 
stream  issue  in  the  context  of  an  example  presented  in  Section  2,  which  shows  how  a  circuit  is 
mapped  onto  the  architecture. 


2.  Mapping  a  Circuit  onto  the  Architecture 

We are using a method of designing systems in which PEs are interconnected with a single
two-dimensional optical perfect shuffle. The goal is to achieve all needed connections using a
single pass through the perfect shuffle, although multiple passes may be needed. A simplified
form of the problem is illustrated in Figure 2a, in which 16 PEs that each have four optical
inputs and four optical outputs are interconnected with a 64-channel one-dimensional perfect
shuffle. Architecturally, the one-dimensional perfect shuffle corresponds to an unfolded
version of a two-dimensional perfect shuffle. The unfolded version can take many different forms,
and the form shown in Figure 2a is used for its visual regularity.

Figure 2: (a) Perfect shuffle interconnected PE array; (b) netlist for the circuit.
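
The one-dimensional perfect shuffle used above can be sketched in a few lines. The sketch
assumes the standard definition of the perfect shuffle (a left rotation of the binary
channel address) and, purely for illustration, that PE k owns channels 4k to 4k+3; the
actual channel-to-PE assignment in Figure 2a comes from the authors' placement software
and may differ. The function names are hypothetical.

```python
def perfect_shuffle(i, n_channels=64):
    """Destination of channel i under a one-dimensional perfect shuffle.

    For a power-of-two channel count this is a left rotation of the binary address.
    """
    bits = n_channels.bit_length() - 1
    if 1 << bits != n_channels:
        raise ValueError("n_channels must be a power of two")
    return ((i << 1) | (i >> (bits - 1))) & (n_channels - 1)


def pe_connections(outputs_per_pe=4, n_channels=64):
    """Map each source PE to the destination PEs reached by its output channels.

    Assumes PE k drives channels k*outputs_per_pe .. k*outputs_per_pe + 3 (illustrative).
    """
    links = {}
    for ch in range(n_channels):
        src_pe = ch // outputs_per_pe
        dst_pe = perfect_shuffle(ch, n_channels) // outputs_per_pe
        links.setdefault(src_pe, []).append(dst_pe)
    return links


if __name__ == "__main__":
    for pe, dests in pe_connections().items():
        print(pe, dests)
```

Each PE reaches only a few fixed destinations in one pass, which is why a netlist has to
be placed carefully, or routed in several passes, to fit the shuffle.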

Using  software  developed  for  this  application,  a  connection  list  among  components  (a 
netlist)  is  automatically  translated  into  a  placement  of  PEs  for  the  perfect  shuffle.  The  netlist 
shown  in  Figure  2b,  which  was  generated  randomly,  led  to  the  mapping  shown  in  Figure  2a. 
More  specific  applications  have  also  been  mapped  such  as  various  types  of  binary  adders  [6]. 
Although a successful mapping of a netlist to a single stage of a two-dimensional perfect shuffle
is possible for the case shown, not all encountered mappings can be satisfied with this
approach. For these cases, multiple passes through the network are required. A key to making
successful  mappings  for  those  that  can  be  satisfied  with  a  single  pass  is  to  use  only  a  fraction  of 
the  available  connections.  For  this  example,  only  12  of  the  64  available  connections  are  used. 


3.  A  Scalable  Method  of  Controlling  the  Microlaser  Array 

Once a mapping from a netlist to the shuffle-connected PE array is obtained, a control
sequence needs to be created that will enable the connections. The 64 channels of the
one-dimensional perfect shuffle shown in Figure 2a are folded in raster fashion into an 8x8
two-dimensional matrix in which 1's and 0's represent enabled and disabled connections,
respectively, as shown in Figure 3a. For instance, the second row from the top in Figure 3a
corresponds to
connections  originating  from  the  PEg,PE2  pair. 

For  an  array  of  individually  addressable  microlasers,  each  microlaser  has  an  independent 
control  line.  Ideally,  we  would  like  to  apply  the  full  64  control  bits  for  the  matrix  shown  in 
Figure  3a  to  the  configuring  microlaser  array  all  at  once.  As  the  size  of  the  matrix  increases, 
however,  the  number  of  control  bits  quickly  increases  to  sizes  that  are  impractical  to  implement 
using  an  electronic  controller.  While  the  control  problem  might  be  solved  optically,  a  more 
scalable  electronic  approach  is  possible  in  which  only  row  and  column  addressing  is  needed 
rather  than  independent  control  of  each  microlaser.  This  is  referred  to  as  matrix  addressable 
control  of  the  microlaser  array. 

The matrix addressable configuration is more scalable than the individually addressable
configuration because the number of control lines scales by only N+M for an N×M array, unlike
an individually addressable array in which the number of control lines scales by N×M.
Arbitrary patterns cannot always be implemented with a single row/column control vector,
however. For these cases, the target pattern must be decomposed into a number of subpatterns that
are applied in sequence, which has the effect of increasing the reconfiguration time. In the
worst case, the number of subpatterns equals the number of rows (or columns, depending on
the orientation), but no greater, since the columns and rows are independently modulated.

Figure 3: (a) Microlaser configuration for the connection pattern shown in Figure 2a; (b-f) matrix addressable
decomposition of the control pattern shown in Figure 3a.

Using decomposition software developed for this problem, the target pattern shown in
Figure 3a is decomposed into five patterns shown in Figures 3b-f, which is less than the worst
case of eight subpatterns (actually, a row-by-row decomposition would result in six subpatterns).
The savings may be small for small matrices such as this, but can be substantial for larger
matrices. Fewer subpatterns may also result if the S-SEED element at each row-column
intersection is toggled rather than being set to an absolute state. In this scenario, the entire S-SEED
array is initially enabled, and then the subpatterns disable S-SEED elements rather than enable
them.
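
One simple way to obtain a matrix addressable decomposition is sketched below. This is
not the authors' decomposition software: it is a basic grouping heuristic, assumed here
for illustration, in which rows that share the same column pattern are driven in the same
cycle. It never needs more subpatterns than there are non-empty rows, but it is not
guaranteed to find the minimum. The function names `decompose` and `compose` are
hypothetical.

```python
def decompose(target):
    """Split a 0/1 matrix into (row_select, col_select) control vectors.

    Rows that share the same non-zero column pattern are driven together,
    so each distinct non-zero row pattern costs one reconfiguration cycle.
    """
    groups = {}
    for r, row in enumerate(target):
        key = tuple(row)
        if any(key):
            groups.setdefault(key, []).append(r)
    n_rows = len(target)
    subpatterns = []
    for cols, rows in groups.items():
        row_select = [1 if r in rows else 0 for r in range(n_rows)]
        subpatterns.append((row_select, list(cols)))
    return subpatterns


def compose(subpatterns, n_rows, n_cols):
    """OR together the outer products of the control vectors (enable-only semantics)."""
    out = [[0] * n_cols for _ in range(n_rows)]
    for row_select, col_select in subpatterns:
        for r in range(n_rows):
            for c in range(n_cols):
                out[r][c] |= row_select[r] & col_select[c]
    return out


# Illustrative 4x4 pattern (not the pattern of Figure 3a).
target = [
    [1, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
subs = decompose(target)
```

For an 8x8 array each cycle drives 8 + 8 = 16 control lines instead of 64, at the cost of
several cycles per configuration.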


4.  Conclusion 

A  model  for  a  reconfigurable  optical  architecture  makes  use  of  a  two-dimensional  microlaser 
controlled  array  of  optical  modulators  interconnected  by  a  two-dimensional  perfect  shuffle. 
Mappings  of  arbitrary  netlists  to  the  architecture  can  be  made  for  a  number  of  cases  with  only 
a single pass through the perfect shuffle. The control stream that enables perfect shuffle
channels is decomposed into matrix addressable subpatterns through a scalable process. The main
conclusion  is  that  a  flexible  reconfigurable  optical  architecture  can  be  achieved  with  a  shallow 
static  interconnect  and  a  narrow  control  stream. 

The  work  reported  in  Section  3  was  jointly  supported  by  the  Air  Force  Office  of  Scientific 
Research and the National Science Foundation (NSF) on NSF grant ECS 93-12625. The
remainder of the work was supported by NSF on grant (MIP 92-24707).

Optical array logic network architecture


J Tanida and Y Ichioka

Abstract. A new concept called optical array logic network architecture (OAL-NA)
is  proposed  for  effective  construction  of  an  optoelectronic  hybrid  computing  system. 
With  the  help  of  optical  array  logic  (OAL),  not  only  data  communication  but  also 
global  data  processing  is  implemented  in  the  interconnection  network.  In  this 
paper,  the  concept  of  the  OAL-NA  and  possible  operation  of  the  architecture  is 
described. 


1.  Introduction 

Digital optical computing is a promising scheme for massively parallel
computation in the near future because it shares a common foundation with
current computer science. Based on this concept, many techniques, including
optical array logic (OAL) [1], have been proposed. However, quantitative
estimation of processing capability shows that a processing style in which all
operations are executed by optical techniques is inefficient, and that it is
difficult to achieve processing capability comparable to that of electronic
computing systems. Current electronic computers offer tremendous computational
power, so massive parallelism and ultrafast operation must be exploited in
optical computing systems if they are to surpass electronic ones. Considering
the status of research on optical computing, we conclude that, at least in the
near future, optoelectronic hybrid computing systems are promising and practical.
However, even for an optoelectronic hybrid computing system, a sophisticated
architecture is required to fully utilize the capabilities of optical computing
techniques. In this paper, we propose a concept for an optoelectronic hybrid
computing system in which OAL is used for a high-functionality interconnection
network.

2.  Optical  array  logic 

OAL  is  a  paradigm  for  optical  parallel  logical  processing  based  on  spatial 
encoding  and  discrete  digital  correlation  as  shown  in  Fig.  1.  Since  OAL  is 
composed  of  simple  operations  on  image  data,  various  optical  techniques  can  be 
used  to  implement  it  and  large  processing  capabilities  can  be  accomplished.  In 
addition,  OAL  is  based  on  a  conventional  logic  system,  so  that  accumulated 
resources in computer science can be utilized. Until now, various application
fields, such as image processing [2], numerical processing [3], emulation of
parallel processor [4], inference [5], and database management [6], have been
investigated with OAL.

Fig. 1 Processing procedure of OAL

However, quantitative estimation of the processing capability of the
algorithms developed in OAL suggests that such processing cannot provide
capability comparable to that of current electronic computers. For example, to
achieve 200 MIPS (million instructions per second), the performance of a medium
level microprocessor, by OAL with 1000 × 1000 pixels, the frame rate must be
60 kHz for 16-bit addition and 7.2 MHz for 16-bit multiplication. Although these
values are not unachievable, it is difficult to find advantages of the OAL
scheme with this processing style. Thus, we have investigated the developed
algorithms and the characteristics of the OAL scheme to overcome the problem.

The following characteristics of the OAL scheme were identified: 1) A single
complicated operation is more effective than a sequence of simple
operations. 2) Template matching and image shifting are good examples of
effective processing. 3) MIMD (multiple instruction stream multiple data
stream) processing is difficult to implement with simple optical techniques. 4)
Programmability of OAL is an important feature for obtaining controllability in
optical systems. Considering these characteristics, the optical array logic
network architecture (OAL-NA) was developed.

3.  Concept  of  OAL-NA 

The  OAL-NA  is  a  conceptual  architecture  of  an  optoelectronic  hybrid  computing 
system  as  shown  in  Fig.  2.  Essentially,  multiple  electronic  processors  execute 
arithmetic  and  logical  operations  in  parallel  and  they  communicate  with  each 
other  through  an  optical  interconnection  network.  The  most  important  feature 
of  the  OAL-NA  is  that  the  optical  interconnection  is  implemented  by  an  OAL 
processor  and  provides  high  functionality:  data  transfer,  data  test,  data 
processing,  and  system  control. 

Figure  3  shows  a  schematic  diagram  of  an  example  system  of  the  OAL-NA. 
In the system, multiple processing elements (PE's) with local memories are
located on one plane (PE array), and a global shared memory and
input/output processors are arranged on another plane (MIO array). The two
arrays are connected by an OAL network processor with symmetric data flow.


Fig. 2 Concept of the OAL-NA

Fig. 3 Example system of the OAL-NA


Output  signals  from  one  array  are  used  as  inputs  to  the  following  OAL  processor 
and  the  output  of  the  OAL  processor  is  transferred  to  the  other  array.  Electronic 
modules  on  the  array  complete  the  encoding  required  in  OAL  and  generate  a 
coded  image  by  composition  of  individual  elements  of  the  modules.  Once  the 
coded  image  is  generated,  the  output  signals  from  the  electronic  modules  are 
processed  by  OAL.  In  the  reverse  data  transfer,  the  same  procedure  is  adopted. 
For  inter-PE  communication,  round-trip  routing  is  established.  Specifying  the 
OAL  processor,  we  can  achieve  versatile  functions. 


4.  Operations  in  OAL-NA 


The OAL-NA provides various operations at the interconnection level. Typical
operations are 1) data transfer (unconditional/conditional broadcasting,
bidirectional shift [4], data exchange, token propagation [5]), 2) data test
(conditional search, data validity check, convergence check), 3) data processing
(differentiation, logical operation for same/neighbor PE's), 4) system control
(command sending, arbitration), and so on. These operations are described by
OAL programs and are easy to execute on an OAL processor. Figure 4 shows
examples and the corresponding OAL programs. Note that the left-hand images are
the outputs of the PE array and that each pixel corresponds to the output of an
individual processing element. The right-hand images are transferred into the
following MIO or PE array. Accumulated resources in OAL research can be fully
utilized for the OAL-NA.


5.  Discussion 


The  OAL-NA  has  various  advantages  with  respect  to  system  construction  and 
system  capability.  Use  of  highly  developed  microprocessor  and  smart  pixel 
technologies  for  PE's  is  practical  and  matches  with  the  current  trend  of 
massively  parallel  architecture  in  computer  science.  It  has  been  pointed  out  that 
optical  interconnection  at  interchip  level  is  advantageous  over  electronic.  In 
addition,  interconnection  merging  with  processing,  the  key  idea  of  the  OAL-NA, 
is  expected  to  provide  more  effective  communication  and  system  operation  than 
the conventional simple optical interconnection. OAL implements SIMD (single
instruction stream multiple data stream) parallel processing, which is suitable
for global search and test on the output data from PE's, memory, and
input/output processors. In terms of processing complexity, global search and
test are appropriate for the OAL scheme and the advantages of OAL can be
effectively utilized. Although an encoding device must be prepared even for a
simple OAL processor, the encoding process can be combined into each electronic
module and no additional device is required in the OAL-NA.

Fig. 4 Example operations in the OAL-NA

The  operational  sequence  of  the  OAL  network  processor  consists  of  two 
phases:  1)  kernel  setup  phase  and  2)  signal  transfer  phase.  The  transmission 
rate  T.R.  of  the  OAL  network  processor  is  estimated  by  Eq.  (1). 


$$\mathrm{T.R.} = \frac{b}{a\,t_{\mathrm{setup}} + b\,t_{\mathrm{trans}}} \qquad (1)$$


where a and b are the required number of kernel setups and the bit number of
the transfer data, respectively; $t_{\mathrm{setup}}$ and $t_{\mathrm{trans}}$ are the
setup time and the transfer time. Equation (1) indicates that T.R. is strongly affected
by a and that the overhead of kernel setup is reduced as b increases. Fortunately, for
most of the operations, a can be set to unity, so that the OAL-NA is expected to
operate without serious performance degradation.
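
The dependence on a and b can be made concrete with a short numerical sketch. It assumes
that Eq. (1) has the form T.R. = b / (a·t_setup + b·t_trans), which is the form consistent
with the two statements above; the function name and the time values are illustrative
assumptions, not figures from the paper.

```python
def transmission_rate(a, b, t_setup, t_trans):
    """Bits transferred per unit time for a kernel setups and b data bits.

    Assumed form of Eq. (1): b / (a * t_setup + b * t_trans).
    """
    return b / (a * t_setup + b * t_trans)


if __name__ == "__main__":
    # Illustrative values only: setup ten times slower than a one-bit transfer.
    t_setup, t_trans = 10.0, 1.0
    for b in (1, 16, 256):
        print(b, transmission_rate(1, b, t_setup, t_trans))
```

As b grows with a fixed at 1, the rate approaches the limit 1/t_trans, which is the sense
in which the setup overhead is amortized over longer transfers.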

6.  Summary 

In  this  paper,  we  have  proposed  a  new  concept  called  optical  array  logic  network 
architecture  for  effective  construction  of  an  optoelectronic  hybrid  computing 
system. With the help of optical array logic, not only data communication but
also global data processing is implemented in the interconnection network. As
a  result,  effective  data  communication  and  system  operation  can  be  expected. 

References 

[1] Tanida J and Ichioka Y 1990 Int. J. Opt. Comput. 1 113-28

[2] Tanida J and Ichioka Y 1988 Appl. Opt. 27 2926-30

[3] Tanida J, Fukui M and Ichioka Y 1988 Appl. Opt. 27 2931-9

[4] Fukui M, Tanida J and Ichioka Y 1990 Appl. Opt. 29 1604-9

[5] Iwata M, Tanida J and Ichioka Y 1992 Appl. Opt. 31 1093-102

[6] Iwata M, Tanida J and Ichioka Y 1993 Appl. Opt. 32 1987-95


Inst.  Phys.  Conf.  Ser.  No  139:  Part  I 

Paper  presented  at  Opt.  Comput.  Int.  Conf.,  Edinburgh,  22-25  August  1994 
© 1995 IOP Publishing Ltd




Optoelectronic  multiport  associative  memory  for  data  flow 
computing  architecture 


V  B  Fyodorov 

Scientific  Computer  Centre,  Russian  Academy  of  Sciences,  Leninsky  pr.  32  A,  Moscow, 
117334,  Russia 

Abstract. A new multiport associative memory based on optoelectronic principles of data
processing is proposed. This memory enables M users to execute simultaneously and
independently a parallel content-based key search and data retrieval in a common memory of
N stored words using M search arguments, as well as random-access writing of keys and data.
The main parameters of such an associative memory are evaluated and its hardware
implementation is discussed.


1.  Introduction 

An  associative  memory  (AM)  is  of  primary  importance  for  data  flow  supercomputers  [1-2] 
and  processing  neural  networks.  Attempts  to  create  a  fast  high-capacity  AM  based  on  various 
physical  principles  have  been  repeatedly  made  almost  since  the  first  computers  appeared. 
However,  all  these  attempts  either  proved  to  be  unsuccessful  for  fundamental  or  technical 
reasons  or  the  implemented  AM  models  could  not  be  used  in  computers  because  of  their  poor 
relative  economical  efficiency. 

In the present paper, the design principles and optical hardware setup of a new
high-speed, high-capacity optoelectronic multiport AM for parallel systems with data flow
organization of the computing process are proposed. The main parameters of such an AM are
evaluated and its hardware implementation is discussed. The possibility of using the proposed
AM as an optoelectronic multiport random-access memory is considered. The principle of
data flow computing organization based on using such memories is discussed.

2.  Multiport  associative  memory  organization 

The multiport AM is a storage that enables M users to execute simultaneously and
independently a parallel content-based key search and data retrieval in a common memory
array of N words using M search arguments, as well as random-access writing of keys and
data. The block diagram of the M-port AM is represented in Fig. 1. The data search process
in such a memory for the m-th port (m = 1, 2, ..., M) consists of the following. A binary
code $a_{1m} \ldots a_{Lm}$ of a key to be found is written in the m-th search argument
register and compared with the keys $b_{1n} \ldots b_{Ln}$ of all (n = 1, 2, ..., N) stored
words. The results of the comparison are fixed in the m-th response store. The addresses of
the coincidences that occur are revealed in turn by a priority multiple-match resolver and
converted into AM address codes, according to which the data stored in the memory cells can
then be read out.

The most difficult problem is to provide a solution to three interrelated tasks: fast
simultaneous transfer of the produced search arguments to each memory cell, bit-wise
comparison of each search argument with all stored keys (the logical function $a \oplus b$),
and detection of coincidences with subsequent information retrieval according to the
located physical addresses of the keys. This problem is difficult to solve mainly because of
the necessity of realising a great many interconnections between memory cells and other
multiport AM devices.
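
The logical behaviour described in this section (bit-wise XOR comparison, a response
store per port, and a priority multiple-match resolver) can be sketched functionally as
follows. This is only a software model of the logic, with keys held as Python integers;
the function names are hypothetical and nothing here represents the optical hardware.

```python
def search(keys, arguments):
    """For each of the M search arguments, return the response vector over the N keys.

    A response bit is 1 when every bit of (argument XOR key) is 0, i.e. the key matches.
    """
    responses = []
    for arg in arguments:
        responses.append([1 if (arg ^ key) == 0 else 0 for key in keys])
    return responses


def resolve(response):
    """Priority multiple-match resolver: yield matching addresses in ascending order."""
    for address, hit in enumerate(response):
        if hit:
            yield address


# N = 4 stored words, M = 3 simultaneous search arguments (illustrative values).
keys = [0b1010, 0b0110, 0b1010, 0b0001]
data = ["w0", "w1", "w2", "w3"]
responses = search(keys, [0b1010, 0b0001, 0b1111])
retrieved = [[data[addr] for addr in resolve(resp)] for resp in responses]
```

Each port gets its own response vector, so the M searches do not interfere with one
another; the resolver then serialises multiple matches within a port.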


3.  Optoelectronic  implementation  of  the  M-port  associative  memory 

The proposed M-port AM has two distinguishing features. Firstly, to bypass the problem of a
high-speed light-sensitive reversible recording medium, writing, erasing, and storing
information in such an AM are performed by electronic methods, whereas the key search is
realized by optical means. For this purpose the memory array is implemented in the form of
two blocks. The first is an optical block for the associative search (SAM), based on a hybrid
memory board (HMB) with optoelectronic, individually electrically addressed VLSI/SLM smart
pixels (see, as an example, [3]), in which the compared bits of all the words are stored. The
second is a block for random-access retrieval, based, for example, on a semiconductor
M-module random-access memory (RAM) or an optoelectronic M-port memory, which stores the
data bits of the words and reads them out by address. The HMB is a 2-D memory cell array,
each memory cell consisting of a spatial light modulator (SLM) and a MOS-transistor-based
trigger placed nearby and electrically connected to the SLM and to the RAM.

Secondly, the like bits of the N keys are grouped together on the HMB in 2L fields:
$b_{11} b_{12} \ldots b_{1N} \;\ldots\; b_{2L,1} b_{2L,2} \ldots b_{2L,N}$. This allows
lenses of an available size to be used for the optical system of the AM, since $N \gg 2L$
as a rule. A graph of the parallel-by-search-argument, parallel-by-bit content-based search
of stored keys for such a bit arrangement on the HMB is depicted in Fig. 2. If optical data
processing methods are used, it is preferable to use logical circuits based on threshold
light inverters (multi-input NOR gates) to reveal coincidences.

The scheme of a block for the parallel optical associative search of keys' addresses is
represented in Fig. 3 for the particular case of M = 4, L = 4, and N = 16. The image
multiplication and superimposition in such an optical system is implemented by using
spherical lenses only (see reference [4]). The stored keywords are registered on the HMB as
transparent and opaque areas (ON and OFF spatial light modulators) corresponding to '1' or
'0' of the binary bit $b_{ln}$. The functions of masked search argument registers are
accomplished by the electrically controlled point light source arrays. The presence or
absence of a light beam corresponds to '1' or '0' in the binary bit of the search argument,
respectively. The physical addresses $c_{mn}$ of coincidence occasions (light absence) are
fixed by photodetector arrays which serve as threshold optical inverter arrays. The binary
bits $a_{1m} \ldots a_{Lm}$ of all (m = 1, 2, ..., M) the search arguments are encoded by a
dual-rail code, and the binary bits $b_{1n} \ldots b_{Ln}$ of all (n = 1, 2, ..., N) the
keywords are stored in an inverse dual-rail code. The optical system projects each search
argument image (plane $P_a$) onto all the keys stored on the HMB. Images of the like bit
fields (plane $P_b$) transmitted through the HMB are superimposed L times in each output
port (plane $P_c$). As is seen in Fig. 3, the optical threshold inverter denoted by entry
$c_{mn}$ fixes the fact of coincidence of the m-th search argument with the n-th key, i.e.
the answer to the inquiry of the m-th user always appears in the m-th output port.

Fig. 3. The optical setup for an optoelectronic multiport associative memory.
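
The role of the dual-rail and inverse dual-rail codes can be checked with a small model.
The sketch below assumes an idealised binary picture: a search bit x drives two light
sources (x, 1 - x), a key bit y is stored as two modulator cells (1 - y, y) where 1 means
transparent, and the detector sums whatever light gets through. Under these assumptions
the transmitted light equals the number of mismatching bits, so a coincidence shows up as
darkness. All function names are hypothetical.

```python
def dual_rail(bits):
    """Encode each bit x as the pair (x, 1 - x): light on exactly one of two rails."""
    return [rail for x in bits for rail in (x, 1 - x)]


def inverse_dual_rail(bits):
    """Encode each key bit y as (1 - y, y): the transparent cell is on the opposite rail."""
    return [rail for y in bits for rail in (1 - y, y)]


def light_at_detector(argument, key):
    """Sum of light transmitted through the key mask; equals the Hamming distance."""
    sources = dual_rail(argument)
    mask = inverse_dual_rail(key)
    return sum(s & t for s, t in zip(sources, mask))


def coincidence(argument, key):
    """Threshold inverter (multi-input NOR): output 1 only when no light arrives."""
    return 1 if light_at_detector(argument, key) == 0 else 0


if __name__ == "__main__":
    print(coincidence([1, 0, 1, 1], [1, 0, 1, 1]))
    print(light_at_detector([1, 0, 1, 1], [0, 0, 1, 0]))
```

Because every mismatching bit contributes one unit of light, codes with a larger minimum
distance give a larger gap between the dark (match) and bright (mismatch) levels.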

The optical system depicted in Fig. 3 ensures the associative search in the HMB in light
transmission mode. However, if light polarization plane modulators are employed for the
HMB, it can be modified to operate in reflected light mode by using a polarizing beam
splitter cube placed before the lens of the HMB. It is significant that the optical system
represented in Fig. 3 is reciprocal with respect to optical inputs and outputs, and hence it
can function as a multiport optoelectronic random access memory if the photoreceiver
arrays are replaced by $N^{0.5} \times N^{0.5}$ laser source arrays and the
$L^{0.5} \times L^{0.5}$ laser source arrays are replaced by $N^{0.5} \times N^{0.5}$
photoreceiver arrays.


4.  Key  coding  and  optical  one-match  resolver 


In the optoelectronic AM there are many causes of a spread in the logical levels of the '1'
and '0' light signals. Its influence on the reliability of the search operation can be
substantially diminished by using a redundant code. In this case, the minimum difference
between two arbitrary code combinations is defined by the code distance d > 1, and hence the
level of the optical signal distinguishing the coincidence of the search argument with a
keyword from noncoincidences is increased (see Fig. 4).

Code combinations with such a minimum code distance can be realized by representing
binary words in Reed-Muller codes [5]. Such codes are characterised by the following
parameters: code length k = 2^q; number of key bits l = C(q,0) + C(q,1) + ... + C(q,h), where
C(q,i) is the binomial coefficient; minimum code distance d = 2^(q-h), where q ≥ 3 is a
positive integer and h is the code order. The analysis shows that the codes with parameters
(l, k, d) equal to (26, 32, 4), (42, 64, 8), (64, 128, 16) and (130, 512, 64) should be selected
from the many Reed-Muller codes. In this case d/k = 1/8, with a relatively small redundancy
a = k/l = 1.3, 1.5, 2.0 and 3.9, and the requirements on the contrast ratio of the smart SLM
pixels and on the light power spread and variation are relaxed significantly, by the factor
p = dl/k = 3.3, 5.3, 8.0 and 16.
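
The four parameter triples quoted above can be checked with a few lines of Python. This is only a sketch: it assumes the standard Reed-Muller parameter formulas (k = 2^q, l equal to the sum of the binomial coefficients C(q, i) for i = 0..h, d = 2^(q-h)), and the function name `reed_muller_params` is introduced here for illustration. All four triples correspond to the code order h = 3.

```python
from math import comb


def reed_muller_params(q, h):
    """Return (l, k, d) for a Reed-Muller code of order h and length 2**q.

    l: number of information (key) bits, k: code length, d: minimum distance.
    Uses the textbook formulas; the naming follows the paper's l, k, d.
    """
    k = 2 ** q
    l = sum(comb(q, i) for i in range(h + 1))
    d = 2 ** (q - h)
    return l, k, d


if __name__ == "__main__":
    for q in (5, 6, 7, 9):
        l, k, d = reed_muller_params(q, 3)
        # k / l is the redundancy a, d * l / k is the relaxation factor p
        print(f"q={q}: (l, k, d)=({l}, {k}, {d})  k/l={k / l:.2f}  d*l/k={d * l / k:.2f}")
```

For every listed code the ratio d/k evaluates to 1/8, which is the value used in the caption of Fig. 4.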

Fig. 4. Idealised dependence of the light level when executing the nonequivalence operation
versus the number p of noncoinciding bits in l-bit keys, in the case of non-excess (1) and
excess (2) Reed-Muller coding with d = k/8 (w(l) and w(k) - one bit energy for 1 and 2,
respectively; W_n - a noise level).

Fig. 5. Simplified block diagram of the data flow architecture; PU - information processing
unit, I/O - input/output ports.


In a data flow computing architecture only a single coincidence of the search argument
with all the stored keys is possible. This characteristic property allows one to simplify
substantially the optical implementation of the multiple-match resolver and encoder
(see reference [6]).

5. Possible performance of the M-port optoelectronic associative memory and its
application for a data flow architecture

In  the  case  of  optoelectronic  AM  implemented  according  to  the  scheme  in  Fig.  3,  the  values 
of  2LN  and  M  can  be  connected  by  the  following  relation:  2LMN  =  S  if  it  is 

supposed  that  a  centre-centre  pixel  spacing  corresponds  to  3 -fold  Rayleigh  criterion,  all  lenses 
with  focal  lengths  F^,  F2,  ...,  Fg  have  equal  numerical  apertures  u  and  hence  Fj  =  F5,  F2  =  F^  == 
Fg,  as  well  as  cross  sections  of  the  optical  system  are  of  the  same  aperture  size  S  ^  DxD.  In 
accordance with this relation, for the concrete values S = 8 × 8 cm, u = 0.25 and the light
wavelength λ = 0.84 μm the value 2LMN = 2.5 × 10^8 bits. This means that the optical block
of the optoelectronic AM can provide, in particular, the following limiting characteristics: a
number of ports M = 256 and a memory capacity of N = 10^4 50-bit dual-rail coded keys.
In this case the centre-to-centre spacings of the pixels in the HMB, of the point light sources
and of the photoreceivers in the arrays are approximately 80 μm, 500 μm and 50 μm,
respectively, the overall length of such an optical system is about 40 cm, and the aperture
size of the smallest lenses is d5 = D/n^0.5 = 0.5 mm (u = 0.25). Fabricating an array of such
lenses to form 100 × 100 resolvable spots will be no problem [7].

The associative access time is determined by many factors. For the optoelectronic AM, two
fundamental factors, related to the finite speed of the optical signal and to the quantum
nature of light, limit the minimum achievable associative access time. The first factor limits
it to a value of t_1 ≈ 1-2 ns. This is the time needed for the light signals to pass from plane
to plane in the free-space optical system of an optoelectronic AM (see Fig. 3). The second
fundamental factor determines the minimum number of photons (or the minimum energy
W_th) that must be recorded for the reliable detection of a coincidence, and hence allows one
to find the limiting value t_2 from energy considerations. In the case of coherent radiation
the threshold sensitivity of arrays containing a great number of photoreceivers is
W_th ≈ 10^-14 J [8]. The value of t_2 is evaluated by the following expression:

where P is the average radiation power of all the light sources. For the above-mentioned
example of an AM (N = 10^4, M = 256) the value of t_2 ≈ 2.5 ns, if the Reed-Muller code with
parameters (42, 64, 8) is employed and P = 80 W. Hence the maximum achievable rate of key
retrieval is limited by the value M/t_2 ≈ 10^11 keys s^-1. In this case the radiation power of
each laser is equal to 5 mW; the admissible heat release in the light source arrays (Q_l), the
HMB (Q_s) and the photoreceiver arrays (Q_r) is equal to Q_l = (1-η)P/η ≈ 8.5 W cm^-2
and Q_s ≈ Q_r ≈ 0.5P/S ≈ 0.6 W cm^-2. Here it is assumed that the efficiency of the GaAs
lasers is η = 0.15 and that aperture and absorption photon losses are neglected.
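
Several of the figures in this section can be cross-checked by simple arithmetic. The sketch below is not taken from the paper: the constant and function names are hypothetical, the values are those quoted in the text (M = 256, N = 10^4, 50-bit keys, code length 64, P = 80 W, S = 8 × 8 cm, overall length about 40 cm, access time about 2.5 ns), and dividing the total power by M times the code length to obtain the power per laser is an assumption that happens to reproduce the quoted 5 mW.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

# Values quoted in the text (names are illustrative only)
M_PORTS = 256
N_KEYS = 10 ** 4
KEY_BITS = 50
CODE_LENGTH = 64            # k of the (42, 64, 8) Reed-Muller code
TOTAL_POWER_W = 80.0
APERTURE_CM2 = 8.0 * 8.0
OPTICAL_LENGTH_M = 0.40
ACCESS_TIME_S = 2.5e-9


def interconnect_count(key_bits, ports, keys):
    """Number of dual-rail bit interconnects, 2 * L * M * N."""
    return 2 * key_bits * ports * keys


def transit_time_s(length_m):
    """Time for light to traverse the free-space system once."""
    return length_m / SPEED_OF_LIGHT_M_PER_S


def retrieval_rate_per_s(ports, access_time_s):
    """Keys retrieved per second when all ports answer every access cycle."""
    return ports / access_time_s


def power_per_laser_w(total_power_w, ports, code_length):
    """Assumed: one laser per code bit per port shares the total power equally."""
    return total_power_w / (ports * code_length)


def areal_heat_w_per_cm2(absorbed_fraction, total_power_w, area_cm2):
    """Heat release per unit area for a plane absorbing a fraction of P."""
    return absorbed_fraction * total_power_w / area_cm2


if __name__ == "__main__":
    print(interconnect_count(KEY_BITS, M_PORTS, N_KEYS))        # 2LMN
    print(transit_time_s(OPTICAL_LENGTH_M))                      # seconds
    print(retrieval_rate_per_s(M_PORTS, ACCESS_TIME_S))          # keys per second
    print(power_per_laser_w(TOTAL_POWER_W, M_PORTS, CODE_LENGTH))
    print(areal_heat_w_per_cm2(0.5, TOTAL_POWER_W, APERTURE_CM2))
```

Under these assumptions the transit time falls inside the 1-2 ns window given for the first limiting factor, and 0.5P/S comes out close to the quoted 0.6 W cm^-2.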

Since  for  a  HMB  data  storage  and  rewriting  are  accomplished  by  electronic  circuitry, 
memory  cell  modulators  do  not  need  to  have  a  threshold  light  modulation  characteristic  that  is 
obligatory  for  a  light-sensitive  reversible  recording  medium.  This  means  a  wide  class  of 
materials  can  be  used  for  SLM  smart  pixels.  We  think  that  it  is  possible  to  reach  in  the  direct 
addressing  mode  a  nanosecond  cycle  for  data  rewriting  in  HMB  cells  by  use  of  such  SLMs 
and  fast  MOS  transistor  circuits  fabricated  by  means  of  integrated  circuit  technology. 

The use of such a multiport AM in a data flow digital computer is illustrated by the block
diagram in Fig. 5. Any result coming out of the m-th processing unit (PU) searches, by means
of a special code (the search argument), for a data set in the AM to be transferred to the next
operation for execution. When the optoelectronic multiport AM contains the searched key,
the data stored in the optoelectronic multiport RAM are read out and the united data set
(two data as a rule) passes for execution by the next operator in the m-th PU. If the searched
key is not found, the key together with the datum coming out of the PU is written to an empty
place in the m-th block of a block-oriented (for writing only!) electronic memory array.
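
The matching step described above can be illustrated with a small software model. This is a minimal sketch, not the optoelectronic implementation: the class name `MatchingStore` and its method are hypothetical, a Python dictionary stands in for the associative memory together with its RAM, and removing a key once it has been matched is an assumption that the text does not state explicitly.

```python
class MatchingStore:
    """Toy model of the key-matching step in a data flow machine."""

    def __init__(self):
        # key -> datum waiting for its partner (stands in for AM + RAM)
        self._waiting = {}

    def submit(self, key, datum):
        """Search for `key`; return the united data set or store the datum.

        Returns a (stored_datum, new_datum) pair when the key is found,
        otherwise writes the key and datum to the store and returns None.
        """
        if key in self._waiting:
            partner = self._waiting.pop(key)  # assumed: a matched key is freed
            return (partner, datum)
        self._waiting[key] = datum
        return None


store = MatchingStore()
first = store.submit("op42", 3)    # no partner yet, datum is stored
second = store.submit("op42", 4)   # partner found, operand pair is released
```

Because each key is unique, at most one stored entry can match a given search argument, which is the property the paper relies on to avoid a multiple-match resolver.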

Since  the  keys  are  unique,  the  architecture  shown  in  Fig.  5  eliminates  the  possibility  of 
memory  and  PU  collisions  during  the  data  processing.  Therefore,  there  is  no  need  to 
incorporate  MxM  truly  nonblocking  interconnection  networks  and  multiword  buffer 
memories in data flow hardware, as is done when M modules of the one-port AM are used [2].

6.  Conclusion 

Our  study  showed  that  it  has  become  possible  to  create  compact  high-performance  M-port 
optoelectronic  associative  and  random-access  memories  for  the  storage  of  N  L-bit  words  by 
an  integration  of  optical  and  electronic  methods  of  information  processing. 

The  total  number  of  global  free-space  interconnects  realized  in  such  memories  by 
optical  means  is  at  least  ~  2LMN  =  10^-10^.  It  is  quite  clear  that  the  physical  implementation 
of  such  a  number  of  independent  interconnects  in  a  limited  volume  is  impossible  by  electrical 
means. 

Abstract: The design and implementation of an optoelectronic data filter are discussed.
Design  requirements  of  the  smart  pixel  arrays  and  system  optics  are  presented.  Preliminary 
system  testing  demonstrates  viability  of  an  optoelectronic  data  filter. 


1.  Introduction 

In  order  to  take  full  advantage  of  the  massive  transfer  rates  of  2-dimensional  and  3- 
dimensional  optical  memories  as  well  as  the  2-dimensional  processing  nature  of  optical 
systems,  new  optoelectronic  components  with  a  high  degree  of  parallelism  must  be  developed. 
These  components  must  be  fast,  perform  a  logic  function,  be  integratable  into  dense  arrays  and 
optically  addressed.  This  smart  pixel  approach  can  then  be  used  to  perform  parallel  operations 
on  a  full  page  of  data,  thereby  greatly  increasing  the  data  throughput.  Since  optoelectronic 
processing  should  not  compete  with  electronic  computing  but  rather  complement  it,  we  propose 
to  fabricate  optoelectronics  interfacing  processors  which  preprocess  data  in  order  to  decrease 
the  input  load  on  the  electronic  host  computer.  One  system  which  has  these  characteristics  is  an 
optoelectronic  database  filter  which  is  a  preprocessor  situated  between  a  high  speed  optical 
memory  and  an  electronic  processing  unit.  This  paper  describes  an  implementation  of  this  filter 
based  on  arrays  of  smart  pixels  containing  vertical  cavity  surface  emitting  lasers  (VCSELs)  and 
heterojunction  phototransistors  (HPTs). 


2.  Optoelectronic  Data  Filter  System 

The  data  filter  accepts  data  in  a  paged  format  and  carries  out  relational  operations  such  as 
projections  and  selections  by  comparing  the  data  with  search  arguments.  These  two  operations 
produce  a  subset  of  the  input  data  which  is  then  fed  to  the  electronic  processing  unit  via  an 
optoelectronic  RAM.  The  two  operations  are  carried  out  within  separate  modules  of  the  filter. 
These  modules  make  up  the  filter  as  shown  in  Figure  1  below;  each  module  contains  arrays  of 
logic  elements.  There  is  an  AND  array  in  both  the  projection  and  selection  modules  along  with 
an  XOR  array  in  the  selection  module.  A  description  of  the  data  filter's  operation  is  given  in 
[1]. 

2.1. Approach

In  designing  the  system,  we  kept  the  following  goals  in  mind.  We  wanted  to  provide  a  viable 
platform on which to test the VCSEL/HPT optoelectronic elements in an array format, and
demonstrate  that  such  arrays  can  be  used  along  with  optical  elements  to  perform  computing 
operations  on  data  sets  in  a  highly  parallel  system. 

Our  first  method  of  implementation  is  to  build  the  system  around  arrays  of  hybrid  devices 
(i.e.,  the  HPT  and  VCSEL  arrays  are  on  separate  chips).  This  allows  us  to  test  the  concepts  of 
the  database  filter  and  identify  possible  design  problems  before  implementing  the  filter  using 
the  more  complex  and  costly  monolithic  design.  The  gate  configurations  used  in  the  filter  are 
shown in Figure 2. The AND gate requires two HPTs which drive a VCSEL. The VCSEL
provides  the  gate  output  and  is  “on”  only  when  both  inputs  are  “on.”  There  are  four  HPTs  in 
the  XOR  gate  with  two  in  parallel  and  two  in  series.  Each  input  is  directed  to  one  parallel  and 
one  series  connected  HPT  pair.  The  VCSEL  is  “on”  when  only  one  of  the  inputs  is  “on.”  For 
the  XOR  gate,  the  design  of  the  HPTs  must  ensure  that  when  both  A  and  B  are  on,  the  series 
HPTs  can  shunt  enough  current  to  drop  the  VCSEL  current  below  threshold. 
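
The logical behaviour of the two gate types can be summarised in a short Boolean model. This is a sketch under simplifying assumptions: device currents and thresholds are replaced by True/False values, the AND gate is modelled as two HPTs that must both conduct, and the function names are introduced here for illustration only.

```python
def and_gate(a: bool, b: bool) -> bool:
    """VCSEL is on only when both input HPTs are illuminated."""
    return a and b


def xor_gate(a: bool, b: bool) -> bool:
    """Model of the four-HPT XOR gate.

    The parallel HPT pair supplies drive current when either input is on.
    The series HPT pair conducts only when both inputs are on and then
    shunts the current, pulling the VCSEL below threshold.
    """
    drive = a or b      # parallel pair
    shunt = a and b     # series pair
    return drive and not shunt


if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, and_gate(a, b), xor_gate(a, b))
```

The design requirement stated in the text, that the series HPTs must shunt enough current when both inputs are on, corresponds to the `not shunt` term in this model.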

2.2.  Array  Design 

There  were  three  main  requirements  in  the  design  of  the  XOR  and  AND  gate  arrays.  They  are: 
maximizing  gate  gain,  minimizing  pixel  size,  and  providing  large  optical  input  windows.  From 
Figure 1, it is apparent that there will be optical signal loss due to the beam splitters as well as
losses  in  the  lenses  and  devices.  Therefore,  in  a  first  order  analysis  there  must  be  gate  gains  of 
at  least  four  to  overcome  the  losses  due  to  the  optical  signals  that  must  pass  through  a 
maximum  of  two  beam  splitters.  Maximizing  gate  gain  in  VCSEL/HPT-based  systems  has 
previously  been  investigated  [2].  Minimizing  the  pixel  size  provides  a  denser,  more  cost- 
efficient  array.  Finally,  providing  large  input  windows  reduces  losses  since  larger  portions  of 
the  optical  signals  enter  the  HPT  active  regions. 


Figure 1: Schematic of database filter showing projection and selection modules [1]

A diagram of the HPT portion of the XOR pixels is shown in Figure 3. As required by the
circuit  in  Figure  2a,  each  optical  signal  illuminates  two  HPTs.  Unfortunately,  the  hybrid 
implementation  shown  here  requires  that  each  pixel  have  an  off-chip  connection  to  a  VCSEL. 
Thus,  the  hybrid  approach  is  not  scalable;  however,  the  monolithic  approach  is  scalable  since 
the  inputs  and  outputs  are  strictly  optical. 


3.  Results 

The  hybrid  XOR  arrays  have  been  fabricated  from  InGaP/GaAs  HPTs  grown  by  GSMBE  on  a 
semi-insulating  substrate.  The  VCSELs  were  fabricated  on  a  separate  AlGaAs/GaAs  wafer 
grown  by  GSMBE  on  a  conducting  substrate.  After  fabrication,  the  XOR  array  was  tested  with 
the  initial  system  setup,  shown  in  Figure  4.  This  initial  setup  was  used  to  identify  the 
difficulties  in  building  the  filter  with  bulk  optics.  Later  versions  will  use  lenslet  arrays  for 
compactness  and  simplicity. 

Figure 3: Layout of XOR HPT arrangement and full XOR array

Figure 4: Initial system setup

Figure 5: Photograph of (a) input pattern on XOR gates; (b) output pattern on VCSEL array


The optical input came from an edge emitting laser at 850 nm, and was split into two paths by
an  initial  beam  splitter.  The  two  optical  signals  were  directed  with  mirrors  onto  two  masks, 
generating  input  patterns  A  and  B.  The  second  beam  splitter  combines  the  patterns  before  they 
are  focused  by  a  single  lens  onto  the  input  windows  of  the  XOR  gates.  The  VCSEL  outputs  of 
the  array  are  on  a  separate  chip.  As  observed  in  Figure  5,  the  VCSEL  array  gives  the  expected 
output pattern. Effort is now concentrated on building the separate modules of the system.
Once  they  operate  correctly,  the  entire  filter  will  be  tested. 


4.  Conclusions 

We have proposed an optical database filter which is based on HPT/VCSEL smart pixel arrays.
Initial system testing has suggested that the filter concepts and design are viable. We are
currently implementing the filter using bulk optics and intend to use microlenses in the
future, as well as to develop the system using monolithic arrays.

Abstract. The design of micro optical systems has to consider the features of micro optical
components.  We  introduce  design  concepts  for  micro  integrated  symbolic  substitution  systems. 


1.  Introduction 

Systolic  arrays  represent  a  concept  for  developing  highly  parallel  computer  systems  using 
regularly  interconnected  simple  processor  arrays  [1].  We  have  recently  demonstrated  that 
systolic  arrays  can  be  easily  mapped  to  symbolic  substitution  rules  [2]  and  thus  can  be 
implemented  optically.  We  have  constructed  an  optical  pipeline  adder  based  on  systolic 
arrays,  which  was  realised  with  macroscopic  optical  components. 

This  adder  consisted  of  an  array  of  8x8  half  adders,  performing  a  full  addition  of  8-bit, 
dual  rail  coded  numbers  in  a  pipeline  within  8  iterations.  The  active  array  consisted  of  16x16 
pixels.  From  space-bandwidth  considerations  one  can  derive  that  imaging  of  such  an  array 
requires  only  lens-diameters  of  a  few  hundred  microns.  Consequently  the  size  of  the  whole 
system  can  be  reduced  into  the  submillimeter  range  using  micro  optical  components.  A 
concept  for  miniaturising  free-space  optical  systems  was  recently  presented  [3,4].  With  this 
stacked  approach,  the  packing  density  can  be  increased  and  the  connectivity  of  three- 
dimensional  optical  systems  can  be  utilised  better  than  with  planar  integration. 
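
The arithmetic behind such a pipeline adder can be sketched in software. The following is an illustration only: it models the repeated half-adder step (sum bit by XOR, carry bit by AND, shifted one position) on ordinary integers, and does not model dual-rail coding, symbolic substitution rules or the optical hardware. The function name `pipeline_add` is hypothetical.

```python
def pipeline_add(a, b, bits=8):
    """Add two `bits`-wide numbers by iterating a row of half adders.

    Each iteration applies one half-adder stage to every bit position:
    the XOR gives the partial sum, the AND gives the carries, which move
    one position to the left. After `bits` iterations no carries remain,
    so the result equals (a + b) modulo 2**bits.
    """
    mask = (1 << bits) - 1
    a &= mask
    b &= mask
    for _ in range(bits):
        partial_sum = a ^ b
        carry = ((a & b) << 1) & mask
        a, b = partial_sum, carry
    return a


if __name__ == "__main__":
    print(pipeline_add(100, 55))
    print(pipeline_add(255, 1))   # worst case: the carry ripples through all 8 stages
```

Eight iterations are sufficient for 8-bit operands because after k iterations the carry word has its k lowest bits cleared.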


2.  Design  Concepts. 

Here  we  try  to  build  on  this  integration  concept  in  order  to  first  realise  a  concept  for  a 
miniaturised  version  of  the  optical  pipeline  adder.  Then  we  will  generalise  this  architecture  to 
implement  general  systolic  array  architectures.  The  concept  takes  the  features  of  micro  optical 
components  into  account  and  is  thus  compatible  with  the  fabrication  constraints. 

The  optical  pipeline  adder  consists  of  a  recognition  and  a  substitution  part.  Each  part  is 
realised  by  a  sequence  of  two  multiple  imaging  systems  (MIS).  Active  components  (NOR- 
and  OR-gate  arrays)  are  located  at  the  entrance/exit  of  each  part.  Each  MIS  (fig.  1)  consists 
of  two  Fourier  transform  stages,  which  are  constructed  as  light  pipes,  because  this 
configuration  offers  best  light  efficiency  and  resolution  [5]. 

The complete sequence of components needed to realise a substitution system is shown
in fig. 2. This stretched-out version could in principle be realised by a stack of optical
components. There are, however, a series of issues that have to be addressed.


Fig. 2: Complete symbolic substitution stage


First, the number of layers to be stacked is very large in this version. This may cause
problems in aligning the system.

Second,  the  active  device  in  the  centre  of  this  sequence  would  have  to  be  realised  in 
transmission  mode,  which  is  undesirable  both  for  thermal  and  for  connectivity  reasons.  A 
more  realistic  design  should  place  the  active  components  at  the  ends  of  the  stack.  Thus  a 
folding  of  the  system  is  necessary. 


3.  Folded  Systems 

The  first  version  of  folded  systems,  shown  in  Fig.  3  has  deflecting  mirrors  in  the  Fourier 
domain.  The  deflecting  prisms  for  the  multiple  imaging  operation  have  to  be  positioned 
between  the  two  mirrors.  Here  also  a  pupil  has  to  be  placed  to  prevent  vignetting,  which 
results  from  the  distance  between  the  2  Fourier  systems. 


Fig. 3: Fourier folded MIS


This  is  not  compatible  with  a  layered  structure,  since  the  pupil  and  the  prisms  would 
have  to  be  oriented  perpendicular  to  the  substrate  surface,  in  order  to  be  located  in  the  centre 
between  the  mirrors. 

Another limitation of this configuration is that the location a of the pupil has to satisfy the
condition D ≤ a ≤ f/2, where D is the lens diameter. The largest numerical aperture (N.A.) is
achieved for minimal a. Thus the best configuration D = a = f/2 results in a maximum N.A. of
0.5. The light efficiency is determined by the diameter of the pupil D_p = D(1 - a/f).

In  our  second  approach  (fig.  4)  we  insert  the  mirrors  into  the  light  pipes.  Here  all 
components  are  arranged  in  separate  layers  and  no  vertically  oriented  components  are 
necessary.  In  this  approach,  we  move  the  image  plane  away  from  the  input  data  plane  and  the 
N.A.  is  consequently  also  limited  by  the  increased  distance  resulting  from  the  double  mirror 
reflection.  The  beam  splitters  required  in  this  approach  can  be  implemented  by  the  LIGA 
technique [6]. The minimum distance b between lenses is limited by:

-  the size of the folding mirrors/beamsplitters (max. D necessary)

-  the thickness of the lens layer d_lens

-  the thickness of the prism layer d_prism

=> b = D + 3 d_lens + d_prism


Fig. 4: Fully folded substitution design with active components at the ends of the layer structure

In  a  next  step  we  can  exploit  the  fact  that  the  required  shifts  in  two  successive  stages 
are  identical.  Thus  the  same  hardware  can  be  used  in  the  forward  and  in  the  backward 
direction  if  a  reflective  array  is  placed  between  these  stages  (fig.  4,  right  side).  This  scheme 
can  be  replicated  infinitely  above  and  below.  Thus  the  initially  stretched  out  system  can  be 
arbitrarily  cascaded  to  achieve  complex  systems  and  also  make  full  use  of  the  available 
substrate  area. 


4.  Space  Bandwidth  Considerations 

We have recently introduced a generalized Fourier system [5] based on the light pipe, which
allows an additional distance a between input/output and the lenses, thus reducing the
distance b between the lenses. Introducing a system factor L = b/f, with L in the range
[0, 1], the generalized Fourier system [5] can be classified by L:

L = 0: 2f-system;  L = 1: light pipe

The  dependencies  of  a,  b  and  the  space  bandwidth  product  N  (number  of  pixels  in  the 
image  plane)  are: 
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"'•^■"2-1  *  "  6.9A-/#-(2-L) 

Assuming λ = 0.65 μm, a given f-number and a lens diameter of 250 μm, Figs. 5 and 6 show
the graphs for a, b and n = √N (number of pixels in one dimension) depending on the system
factor L.


Fig. 5: Distance parameters a and b as function of the system factor L


Fig. 6: One-dimensional SWP as function of the system factor


Assuming a thickness of the prism layer of d_prism = 250 μm, the minimum distance a can
be 125 μm, thus resulting in n = 72 pixels. It has to be noted that the pixel configuration
assumes a pixel pitch of two times the pixel size.

5. Conclusion

Here we investigated the design of folded symbolic substitution systems by considering
the constraints of micro optical components. The system design is suitable for the mapping
of systolic arrays to symbolic substitution rules. The folding of the systems is done by mirrors
and beamsplitters in the Fourier domain or within the light pipe. Common micro lens arrays
used with the system offer only a limited space bandwidth product (N = 72² pixels). Multiple
imaging architectures with multiple lenses can work around these limitations [7, 8].

Optoelectronic 3-D Architectures and appropriate
Algorithms for TFLOP Computing

Abstract. A 3-D computer architecture with smart pixels is presented exploiting dense optical
interconnections and appropriate algorithms to combine parallel processing techniques such as
pipelining and array processing. Simulation results specify the hardware requirements needed
to enter the TFLOP range.


1.  Introduction 

The challenge to build parallel computers with a TFLOP peak performance requires not only
fast processing elements (> 1 ns) but also high communication bandwidth (> 1 TBit/s). Highly
integrated electronic logic and high-density optical interconnections in smart pixel systems
fulfil these requirements in principle. 3-D computers with smart pixels offer a large potential
of computing performance [1], [2]. Our architecture concept allows the combination of the two
fundamental techniques in parallel computing, pipeline and array processing.

An important question in this context is how complex a simple smart pixel processing
element should be. We show that small processing elements are preferable because they
offer more parallelism than larger processing elements. The use of small processing
elements does not inevitably mean a loss of computing performance if pipeline mechanisms
are consistently exploited. This needs appropriate algorithms and a special smart pixel
architecture, as we present in the next two sections. In our investigation we assumed a
smart pixel system consisting of smart detectors integrated in silicon and an externally
realised sender array with surface emitting microlasers.


2.  Bit  algorithms 

In numerical processing the computation of complex functions such as floating-point division,
sine, cosine, logarithm and the exponential function is absolutely necessary. This can be
performed with simple processing elements if so-called convergence or bit algorithms [3] are
used. These algorithms have already been used successfully on a massively parallel computer,
the DAP (Distributed Array Processor). Architecture concepts similar to that of the DAP have
often been developed for digital optical and optoelectronic computing [4], [5]. On the DAP the
computation time for the above-mentioned functions was in the range of 1-3 floating-point
multiplications, instead of 10-12 if a "traditional" calculation scheme such as a Taylor series
had been used [6]. This algorithm class is well suited for smart pixels because only simple
operations like shifting, conditional addition and memory access to predefined bits are
necessary. This makes a simple setup of the smart pixels possible. The principle of a bit
algorithm is the iterative computation of a pair of real numbers (x_i, y_i). The iteration
process stops when x_i has reached a known value. Then y_i has automatically become the
desired function value. This principle is shown in (1) for the example of a division.


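$$
Q = \frac{Y}{X} = \frac{Y \cdot a_1 \cdot a_2 \cdots a_n}{X \cdot a_1 \cdot a_2 \cdots a_n},
\qquad
X \cdot a_1 \cdot a_2 \cdots a_n \to 1,
\quad
Y \cdot a_1 \cdot a_2 \cdots a_n \to Q
$$

$$
\begin{aligned}
X_{i+1} &= X_i \cdot a_i = X_i \cdot (1 + 2^{-i}), & Y_{i+1} &= Y_i \cdot a_i = Y_i \cdot (1 + 2^{-i}) \\
X_{i+1} &= X_i, & Y_{i+1} &= Y_i \qquad \text{if } X_{i+1} > 1
\end{aligned}
\tag{1}
$$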


By multiplying numerator and denominator by the same multiplying factors a_i, the
resulting denominator converges toward unity, and the resulting numerator converges
automatically toward the desired quotient. The multiplying factors a_i are selected in such a
way that the multiplication is reduced to a shift operation and an addition. Similar methods
are available for the computation of square root, reciprocal values, sine, cosine, logarithm
and exponential [3]. The iterative calculation scheme for the logarithm and the exponential
function is shown in (2).


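Exponential function, $y_n = e^{x}$, with $y_0 = 1$, $x_0 = x$, $0 < x < 1$:

$$
\begin{aligned}
x_{i+1} &= x_i - \ln(1 + 2^{-i}), & y_{i+1} &= y_i + (y_i \ \mathrm{shr}\ i) \\
x_{i+1} &= x_i, & y_{i+1} &= y_i \qquad \text{if } x_{i+1} < 0
\end{aligned}
$$

Logarithm, $y_n = \ln x$, with $y_0 = 0$, $x_0 = x$, $0 < x < 1$:

$$
\begin{aligned}
x_{i+1} &= x_i + (x_i \ \mathrm{shr}\ i), & y_{i+1} &= y_i - \ln(1 + 2^{-i}) \\
x_{i+1} &= x_i, & y_{i+1} &= y_i \qquad \text{if } x_{i+1} > 1
\end{aligned}
\tag{2}
$$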


The values ln(1+2^-i) are precalculated constants. Four basic operations are fundamental for
the calculation of a bit algorithm: addition; multiplexing, to select the right operand pair for
the new iteration; checking the condition x_{i+1} < 0 or x_{i+1} > 1, respectively; and access
to memory, which stores the constant values and the data shifted to the right by i bits
(operator shr in eq. 2).
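
The following Python sketch illustrates how such bit algorithms work. It is an interpretation, not the authors' implementation: floating-point numbers stand in for fixed-point mantissas, the right shift is emulated by multiplication with 2^-i, the iteration index is assumed to start at i = 0, the logarithm uses the multiplicative normalisation of x toward 1, and the iteration count of 48 is an arbitrary choice. With factors starting at i = 0, the division and logarithm sketches converge for arguments between about 0.25 and 1.

```python
import math

ITERATIONS = 48  # assumed iteration count, roughly one per mantissa bit

# Precalculated constants ln(1 + 2^-i), as described in the text
LN_TABLE = [math.log(1.0 + 2.0 ** -i) for i in range(ITERATIONS)]


def shr(value, i):
    """Right shift by i bits, emulated on floats as multiplication by 2^-i."""
    return value * 2.0 ** -i


def bit_divide(y, x):
    """Approximate y / x for 0.25 <= x <= 1 with shifts and additions only."""
    for i in range(ITERATIONS):
        x_next = x + shr(x, i)            # tentative X_i * (1 + 2^-i)
        if x_next <= 1.0:                 # keep the old pair if X_{i+1} > 1
            x, y = x_next, y + shr(y, i)
    return y


def bit_exp(x):
    """Approximate e**x for 0 <= x < 1."""
    y = 1.0
    for i in range(ITERATIONS):
        x_next = x - LN_TABLE[i]          # tentative x_i - ln(1 + 2^-i)
        if x_next >= 0.0:                 # keep the old pair if x_{i+1} < 0
            x, y = x_next, y + shr(y, i)
    return y


def bit_ln(x):
    """Approximate ln(x) for 0.25 <= x <= 1."""
    y = 0.0
    for i in range(ITERATIONS):
        x_next = x + shr(x, i)            # drive x toward 1
        if x_next <= 1.0:                 # keep the old pair if x_{i+1} > 1
            x, y = x_next, y - LN_TABLE[i]
    return y


if __name__ == "__main__":
    print(bit_divide(3.0, 0.75), 3.0 / 0.75)
    print(bit_exp(0.5), math.exp(0.5))
    print(bit_ln(0.3), math.log(0.3))
```

Each iteration needs only an addition, a shift, a comparison and a table lookup, which are the four basic operations listed above.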


3.  A  3-D  smart  pixels  architecture  based  on  bit  algorithms 

Figure 1 displays the scheme of the optoelectronic 3-D architecture based on bit algorithms.
SEL, ADD and FLAG are processing layers, which we mapped onto smart pixel arrays.
These smart pixel arrays perform the elementary operations of the bit algorithms. The layer
SEL selects the right operand pair (x_i, y_i) or (x_{i+1}, y_{i+1}) for the next iteration. The
layer ADD works as a 1-bit full adder. The task of the layer FLAG is to check the conditions
described above. A 3-dimensionally organised data memory (DATA) is connected to the
processing layers. In the i-th row of the memory the i-th bit of the operand is shifted out
first to realise the necessary shift operation of the bit algorithm.

Figure 1: Diagram of the optoelectronic 3-D architecture


The architecture processes data bit-serially and word-parallel. All layers operate in
pipeline mode. Each row of the 3-D architecture contains a pipeline. In every pipeline one
iteration of the algorithm is executed. If m is the number of mantissa bits, the results
appear at the top of the 3-D memory after m+1 iterations. If the architecture is completely
filled with data along the vertical axis, a result is produced in the time in which one
iteration is executed. Not just one result is generated at a time, but as many results
simultaneously as there are pipelines or rows integrated along the depth of the 3-D system.
Hence, each layer contains an array of pipeline stages. The maximum number of parallel
pipelines is determined by the most complex smart pixel array. It turned out that this was
the adder layer, with 45 transistors, one flip-flop and 2 optical inputs and outputs per smart
pixel.
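
The latency and throughput statements above can be expressed as a small counting model. This sketch only restates the text (first result after m+1 iterations, then one further result per iteration from each filled pipeline); the function names are hypothetical and no hardware timing is modelled.

```python
def iterations_until_done(results, mantissa_bits):
    """Iterations needed for one pipeline to deliver `results` results.

    The first result appears after mantissa_bits + 1 iterations; once the
    pipeline is full, every further iteration delivers one more result.
    """
    if results < 1:
        raise ValueError("results must be at least 1")
    return (mantissa_bits + 1) + (results - 1)


def results_per_iteration(pipelines):
    """Steady-state results per iteration when all pipelines are full."""
    return pipelines


if __name__ == "__main__":
    print(iterations_until_done(1, 23))     # latency of a single result
    print(iterations_until_done(100, 23))   # latency is paid only once
    print(results_per_iteration(64))
```

The model shows why small processing elements need not cost performance: the fill latency is paid once, after which throughput scales with the number of parallel pipelines.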


4.  Specifications  for  the  hardware  requirements 

With the software tool HADLOP [7], a hardware description language that we developed
specifically for the simulation and evaluation of digital optical and optoelectronic architectures,
we simulated the architecture described above to derive specifications for the required
hardware.

Figure 2 shows the result for a setup of the architecture in which the smart detector
and sender arrays are hybridly integrated on a silicon waferboard of 15x15 inch size, with the
optical interconnects running through a glass plate mounted on top of the wafer. The left
vertical axis shows the computing performance we can expect for the processing of the mantissa
bits versus the integration density in the smart detector. The curves correspond to various
delay times through one smart pixel array. Within this delay time, two gates and the
optoelectronic interfaces at the input and output side have to be passed. The right vertical
axis shows the power loss in the microlaser array for two different assumed power losses per
laser diode. If we assume an integration density of 7000 transistors/mm² (state of the art
in the DEC ALPHA microprocessor) and a delay time of 1-2 ns for one smart pixel array, more
than 200 GFLOPS would be achievable. A problem arises from the heat dissipated by the laser
diodes, which would amount to 12 W/cm² if 1 mW of dissipated power per laser diode is assumed.

If a further reduction in power consumption per laser diode is achievable, it would be
possible to set up a space-saving optoelectronic TFLOP computing system on only 5 waferboards.
The heat problem would then probably lie more in the power dissipation of the smart
detectors if data rates of up to 1 GHz are intended. With sophisticated cooling techniques this
problem should be solvable.
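
The figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical, and it assumes that the 200 GFLOPS estimate refers to a single waferboard.

```python
import math


def lasers_per_cm2(heat_flux_w_per_cm2: float, power_per_laser_w: float) -> float:
    """Laser density implied by a heat flux and a per-diode dissipation."""
    return heat_flux_w_per_cm2 / power_per_laser_w


def boards_for_target(target_gflops: float, gflops_per_board: float) -> int:
    """Number of waferboards needed to reach a target performance."""
    return math.ceil(target_gflops / gflops_per_board)


if __name__ == "__main__":
    # 12 W/cm^2 at 1 mW per diode implies roughly 12,000 diodes per cm^2.
    print(round(lasers_per_cm2(12.0, 1e-3)))
    # 1 TFLOPS (1000 GFLOPS) at 200 GFLOPS per board (assumed per-board figure).
    print(boards_for_target(1000, 200))
```

Under these assumptions the quoted heat load corresponds to about 120 laser diodes per mm², and five boards at 200 GFLOPS each reach the TFLOP range mentioned above.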

Figure 2: Performance analysis for massively parallel smart pixels architecture


5.  Conclusion  and  future  outlook 

We presented a concept for a powerful architecture for an optoelectronic massively parallel
computer. If the hardware parameters we specified can be fulfilled, the performance of such an
architecture can be expected to exceed that of purely electronic parallel systems. This
will be possible not only through better devices but also through the right combination of
algorithm, architecture and intensive exploitation of optical interconnects, as we have shown here.

Our first-order performance estimate shows that it is worthwhile to carry out further
investigations towards general-purpose architectures. Our smart pixel needs only a small
area. Hence, there is enough space to increase the complexity towards programmability and
larger local storage capacity. This will lead to 3-D computers with a great communication
potential because of optical interconnects. Such architectures are especially suited for
solving 3-D problems such as parallel volume rendering, processes in fluid mechanics and
computer simulations using particle models. These are possible application fields for smart
pixel computers that we will investigate in the future. For an efficient mapping onto real
smart pixel systems, close co-operation between device technologists and computer architects
is necessary to lead optoelectronic computing to success.

Abstract. A "transverse interconnection optical processor architecture" is proposed
which allows flexible, optically programmable data exchange between individual beam
channels without changing their relative dispositions.


1.  Introduction 

Using light as an information carrier usually implies that the direction in which data
are transported, in free space or in a waveguide, coincides with the light beam direction.
However, there are a number of nonlinear optical phenomena, such as transverse effects
in optical bistability (OB), that allow the information flow to be directed perpendicularly
to the light beam. Studies of such nonlinear transverse phenomena have shown that
different steady profiles of the output beam/image intensity exist for the same
distribution of input light interacting with a bistable interference layer. The selection of
a desired profile, as well as transitions between profiles, can be easily controlled optically.
This provides a basis, for instance, for an all-optical implementation of 2D data shift. In
this method all the stages of the information transfer are performed in the plane of the
matrix of optical elements. It provides an opportunity for organizing the interconnections
and information exchange between neighbour logic elements/pixels within a 2D array, leading
to new optical computer architectures.


2.  Basic  method 

In this paper the method of "transverse lock-and-clock" processing [1] is extended to
"planar - free space" optical interconnections and circuits. The basic idea of the
"planar - free space" architecture is to combine parallel data transfer between sequentially
located (within a loop parallel processor) matrices of switching/bistable elements with
planar transverse interconnections and processing within the plane of each matrix.

Fig. 1. Base element of the shift processor with longitudinal S_in/S_out and
transverse D_in/D_out input/output.

In such an approach an ensemble of light beams (a digital image) preserves the
topology of their relative disposition (with no shuffle, etc.) when propagating in free
space between two sequentially located nonlinear matrices, in contrast to other flexible
interconnection architectures. The flexibility of the information connections between
different parts of the image is achieved through the optically programmable transverse
transport and redistribution of the different logical states of transmission/reflection within
the matrices of nonlinear elements. As a result, the data in different individual channels
are exchanged without changing the spatial positions of the beams that carry the data
along the processor loop.

The basic element of such a shift processor is a pair of coupled neighbour optical cells
(pixels), shown in Fig. 1. In comparison with traditional OB elements, an additional
channel arises here for the input of information into a pixel of the 2D optical processing
matrix, owing to data transfer from neighbour pixels.

The  main  features  of  the  shift  processor  are: 

*  possibility of transverse data shift simultaneously in counter directions
without interaction between the counter-propagating data flows ("transparent
mode");

*  logical operations with the neighbouring digits of the 2D matrix;

*  input and storage of several 2D data pages, as well as of interim results of
logical operations, in the same OB-plate;

*  possibility to design multilayer processing structures with lateral
input/output.


3.  Design  and  experimental  results 

The schematic layouts of all-optical serial-parallel and parallel-serial data converters,
stack memory, multiplexer-demultiplexer, non-blocking crossbar, planar loop circuits
and regular networks are given in Fig. 2. Experimental modeling of data converters and
stack memory has been performed [1] using a 2D bistable thin-film Fabry-Perot
interferometer [2] with an optical aperture of 36 mm.

Fig. 2. Optical "planar - free space" logic circuits

For practical realizations of "planar - free space" circuits, OB-matrices based on
media with electronic nonlinear mechanisms are more promising from the viewpoint of
operation rate. We report here preliminary experimental results on detecting switching
waves and measuring their speed in one such medium. These results will be presented in
detail later in a separate publication.

All-epitaxial GaAs/GaAlAs optically bistable Fabry-Perot etalons were used that
are similar to those described in [3]. The switch-on/off time for a single pixel has been
measured as ~ 40 ns [3]. Nonlinear changes in our experiments were monitored in the reflected
beam. The pulsed output of a tunable color-center laser at a wavelength of 891 nm,
in the vicinity of an interference maximum, was focused on the etalon's surface into a
spot with a diameter of about 200 μm. By measuring local intensities of reflected light
at different points along the beam radius (similar to the technique of [4]) we estimated
the speed of the switching wave in the on-axis area of the beam as ~ 1 μm/ns. Therefore,
one can expect that in such OB-etalons a parallel transverse shift of information between
neighbour pixels could take about 10-50 ns.
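
As a rough consistency check on this estimate, assume that the switching wave crosses a pixel spacing d at a constant speed v. The symbols d, v and t_shift are introduced here for illustration only, and the implied pixel spacing is an inference, not a value stated by the authors:

```latex
t_{\text{shift}} \approx \frac{d}{v}, \qquad
v \approx 1\ \mu\text{m/ns}
\;\Longrightarrow\;
t_{\text{shift}} \approx 10\text{--}50\ \text{ns}
\quad\text{for}\quad d \approx 10\text{--}50\ \mu\text{m}.
```

Under this assumption, the quoted shift times correspond to pixel spacings of a few tens of micrometres.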

For media with picosecond relaxation times this estimate can be smaller by 2-3
orders of magnitude. In this case the fraction of the cycle time required for transverse
modification (patterning) in the OB-matrix of a parallel loop processor will be comparable
with that required for longitudinal transfer of information between OB matrices.


4.  Conclusion 

An approach to architecture design is proposed which combines parallel free-space data
transfer between sequentially located (within the optical loop parallel processor) OB
matrices with planar transverse interconnections and data processing within the plane
of each matrix, using transverse effects in optical bistability. Some simple "planar - free
space" circuits are modelled. Experimental estimates of the speed of switching waves in
all-epitaxial GaAs/GaAlAs etalons are made.

Abstract. Operation of a shift register based on the propagation of switching
waves in distributed nonlinear media is studied by numerical simulation.


1.  Introduction 

The concept of lock-and-clock architecture for optical data processing has recently been
modified to take into account two transverse degrees of freedom, and an all-optical shift
register based on this concept has been demonstrated experimentally [1]. Information
shift in the register takes place in the plane of a matrix of optical bistable elements. For
further development of this concept an adequate theoretical description would be useful.


2.  Theoretical  model 

In the present contribution we describe a theoretical model of a planar all-optical chip
consisting of an array of bistable elements that can be coupled using transverse
interconnections. Analysis of the model is based on solving coupled equations for the light
field in the optically nonlinear spacer of a Fabry-Perot interferometer and equations that
describe phenomenologically the diffusion of the medium's nonlinear parameters. In general
form these equations can be written as

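$$\rho_i c_i \frac{\partial U_i}{\partial t} = \kappa_i \nabla^2 U_i + Q_i\bigl[I_{in}(\mathbf{r},t)\,F(U_j),\,U_j\bigr] - \rho_i c_i \frac{U_i}{\tau_i}, \qquad i,j = 1,2,\ldots \qquad (1)$$

$$F(U_j) \propto \frac{1}{\bigl(1 - R e^{-\alpha h}\bigr)^2 + 4 R e^{-\alpha h} \sin^2\Bigl\{\pi\Bigl[\varepsilon + \frac{m}{n_0}\sum_i \delta n_i\Bigr]\Bigr\}}$$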

where U_i are parameters of the medium which contribute to the nonlinear change of refractive
index, for instance, carrier density (electrons, holes, excitons), temperature, pressure,
etc.; ∇² is the Laplacian; ρ_i, c_i, κ_i are physical characteristics of the medium, in
particular, κ_i determine the diffusion of the corresponding parameters U_i; Q_i are source
functions for U_i; I_in is the incident light intensity distribution; F is the function giving
the relationship between the light field within the spacer of the Fabry-Perot interferometer
and the incident light field; τ_i are time constants describing the relaxation of U_i; R is
the power reflectivity of the mirrors; α is the absorption factor; h is the thickness of the
spacer; m is the order of interference; ε is the initial detuning; n_0 is the refractive index;
δn_i is the change of refractive index associated with the change of U_i.

Initial and boundary conditions correspond to those in typical experimental
realizations of OB-layers.

The main assumption of this theoretical model is that all changes of the medium's
parameters U_i contributing to optical nonlinearity are much slower than the light field
build-up inside the interferometer. This statement is quite obvious, since for a spacer of
1st-3rd order of interference, as in our calculations, the internal light field reaches its
steady-state value
in  ~  10"^^  Dealing  with  thin-base  Fabry-Perot  interferometers  one  may  also 

consider  that  change  of  refractive  index  does  not  depend  on  longitudinal  space  variable 
and  is  determined  by  mean  values  of  parameters  Ui  averaged  over  the  thickness  of  spacer. 

Since the model is applied to numerical simulation of real experimental conditions
where thin-film interferometers (TFI) with Fresnel number of about 10^ are used,
diffraction is negligible and is therefore not taken into account.

The approach can deal with various types of nonlinearity, such as band-filling or
nonlinearity due to free carrier generation. For each case one has only to define correctly
the functions Q_i which are responsible for the rise of nonlinearity.

In particular, on the basis of the above model we have performed numerical
simulations for nonlinear thin-film interferometers [2] where the nonlinear mechanism is the
thermo-optical effect. In this specific case U is the temperature of the spacer, and the
thermally induced change of its refractive index can be found by averaging over the
thickness of the spacer:

$$\delta n_T(T) = \frac{\partial n}{\partial T}\,\frac{1}{h}\int_0^h T(z)\,dz$$

The function Q then has the physical meaning of the light power absorbed in the TFI and
released as heat; κ is the thermal conductivity, ρ the density, c the specific heat, etc.
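
A model of this kind can be explored numerically with a very small finite-difference scheme. The sketch below is not the authors' code: it is a dimensionless, one-dimensional illustration that assumes a single parameter u (a normalised temperature), an Airy-type resonance factor with hypothetical default coefficients, zero-flux boundaries, and explicit Euler time stepping.

```python
import math


def airy_factor(u, detuning=-0.3, finesse_coeff=10.0):
    """Assumed Airy-type resonance factor; u is a normalised nonlinear phase shift."""
    return 1.0 / (1.0 + finesse_coeff * math.sin(math.pi * (detuning + u)) ** 2)


def step(u, i_in, dt=0.01, dx=1.0, diffusion=1.0, tau=1.0):
    """One explicit Euler step of du/dt = D u_xx + I_in * A(u) - u / tau.

    u and i_in are lists of equal length (one entry per grid point).
    All parameters are dimensionless illustrative values.
    """
    n = len(u)
    new_u = []
    for k in range(n):
        left = u[k - 1] if k > 0 else u[k]        # zero-flux boundary
        right = u[k + 1] if k < n - 1 else u[k]   # zero-flux boundary
        laplacian = (left - 2.0 * u[k] + right) / dx ** 2
        source = i_in[k] * airy_factor(u[k])
        new_u.append(u[k] + dt * (diffusion * laplacian + source - u[k] / tau))
    return new_u


if __name__ == "__main__":
    # Uniform holding beam with a local rise of intensity at one edge.
    u = [0.0] * 50
    i_in = [0.5] * 50
    i_in[0] = 1.5
    for _ in range(2000):
        u = step(u, i_in)
    print(max(u), min(u))
```

The explicit scheme is stable only while dt * diffusion / dx**2 stays below 0.5, which the default values respect; an implicit scheme would be a reasonable alternative for finer grids.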


3.  Results  and  discussion 


The model allows one to study the dynamics of light intensity redistribution in spatially
extended 2D OB plates, including well-known transverse effects in OB such as the emergence
and propagation of switching waves, the formation of steady profiles, etc. For example,
Fig. 1 gives evidence of switching waves for a uniform distribution of input intensity. The
waves are triggered by local rises of input intensity near the edges of the uniform holding
beam. A local drop of input intensity, in turn, stops the wave fronts and prevents the
switching waves from spreading further. It is this circumstance that allows the information
states of neighbour OB-pixels to be isolated in "transverse lock-and-clock architecture"
devices. Speed of stationary switching wave for the particular conditions is ~ 1.3

Fig. 1. Switching waves in an extended homogeneous OB-layer at uniform distribution
of holding input intensity.

Fig. 2. Spatial and temporal dynamics of the shift register during an operation cycle: (a, c) -
output and (b, d) - corresponding input intensity distributions along the register. Upper
plots show the "1"→"0" shift operation, lower ones the "0"→"1" operation. Arrows on the plots
for input intensity distributions point to the stage of recording initial information into
the main pixels.

A nonuniform and time-dependent distribution of input intensity results in more
complicated behaviour of the system. In particular, using the described theoretical
model we have performed a numerical simulation of optical data shift in the simplest
4-pixel shift register studied experimentally in [1]. Fig. 2 shows the evolution of input
and output intensity distributions along the register. The input intensity is a combination
of 4 independently controlled gaussian beams. Notation of pixels is the same as in [1].
From Fig. 2,a one can see that the "1"-state of source pixel S is shifted through intermediate
transfer pixels T1 and T2 to destination pixel D by means of corresponding intensity
modulation in all gaussian beams.

If pixel S is initially in the "0"-state, the same modulation sequence shifts this
information bit to pixel D, which is in the "1"-state at the beginning of the shift cycle
(Fig. 2,c). All other possible combinations of binary data stored in pixels S and D have
also been subjected to the rightward shift operation. Leftward shift has been performed as
well by exchanging the modulation sequences of intermediate pixels T1 and T2. For the shown
disposition of pixels (radius of each gaussian beam 10 μm, distance between them 32 μm) the
entire shift cycle takes about 200 μs. This time obviously depends on the size of the pixels,
and it is possible to achieve faster operation by placing pixels closer to each other.

Fig. 3. Dynamics of output (a) and input (b) intensity profiles along the shift register
during data transfer for the case of sharper focusing of input gaussian beams.

Fig. 3 shows all stages of the shift cycle "1"→"0" for more sharply focused input
beams (radius of 3 μm); distances are made smaller by the same ratio. In this arrangement
the rise of temperature in pixel S can directly influence the information state of pixel
D, therefore the range of intensity modulation in all pixels should also be changed to
provide correct operation of the shift register. However, when the proper parameters of
input intensities are found, the shift cycle can be reduced to ~ 30 μs, exhibiting a nearly
square dependence on the pixel size.
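
The size dependence can be checked against the two operating points quoted above. The sketch below is illustrative; the function names are hypothetical, and reading the reduced cycle time as roughly 30 μs is an assumption about the quoted figure.

```python
import math


def scaling_exponent(t_large, t_small, r_large, r_small):
    """Exponent p of a power law t ~ r**p passing through two (radius, time) points."""
    return math.log(t_large / t_small) / math.log(r_large / r_small)


def quadratic_prediction(t_large, r_large, r_small):
    """Cycle time expected if t scaled exactly as r**2 (pure diffusion scaling)."""
    return t_large * (r_small / r_large) ** 2


if __name__ == "__main__":
    # Quoted points: about 200 us at 10 um beam radius, about 30 us at 3 um.
    print(scaling_exponent(200.0, 30.0, 10.0, 3.0))
    print(quadratic_prediction(200.0, 10.0, 3.0))
```

With these inputs the fitted exponent comes out at about 1.6, while an exactly quadratic law would predict about 18 μs, so the quoted values are consistent with a dependence somewhat weaker than a pure square law.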

By introducing additional gaussian beams into the input intensity distribution,
one can translate the simplest 4-pixel shifting chain along the transverse dimension. Linear
circuits of pixels arranged in such a way may be regarded as multi-digit shift registers.
Their operation, as well as the behaviour of more complicated planar circuits, can also be
studied numerically on the basis of the above theoretical model.

Abstract. New image processing capabilities arising from the presence of
controlled local connections of an inhibitory type between cells in an
optoelectronic trigger medium are discussed.


1.  Introduction 

An optoelectronic memory usually consists of independent cells with optical inputs. We
consider new capabilities of optoelectronic memory arising from the presence of local
connections between cells. The memory medium is able to extract (process) the informative
features of recorded images when we use specific types of connections. Resolution and
memory capacity are determined by the capabilities of microelectronic technology. We will
refer to a memory medium with connected cells as MCC. The possibilities of creating
solid-state continual media and of processing information with them are discussed in [1-3].
Analogue microelectronic realizations of homogeneous media of active elements with local
connections, which were called Cellular Neural Networks (CNN), were proposed in [4].


2.  Design  of  the  cell 

Consider the binary optoelectronic memory shown overpage. Every cell comprises a trigger T
with an optical input and output. The image projected onto the medium is detected by a
photocell (C), and is recorded by trigger flip-over from state T1 to state T2 (T1→T2). The
circuits (photocells) for reading (R) and erasing (D) are switched off during recording.
Uniform illumination of all R turns on the light emitters (E) of all cells where trigger T is
in the state T2, while the photocells C and D are turned off. This restores the recorded
image. The picture is erased by the flip-over T2→T1 under uniform illumination of all D.

Every cell is connected with its nearest eight neighbors (here we discuss the square
array). The connections are of inhibitory type; they block connected cells, i.e. prevent
the T1→T2 flip-over. A special threshold element (TE) (coupled with D to the trigger arm T2)
combines all connections from neighbouring cells. The threshold control unit (TC) sets the
number of connections (N) necessary for a signal to appear at the TE output. This signal
switches the memory cell off (T2→T1) in the same way as the D signal does.

The system also includes a connection distributor (CD) and a light emitter control unit
(ES). When E is turned on, after the delay time the ES unit switches E off for the whole
image analysis cycle.

The main feature of MCC with inhibitory connections is that the memory state of the
chosen cell depends only on the number of illuminated neighbour cells which prevent the T1→T2
flip-over by their connections. Of course, there is also a reverse inhibitory effect of the
chosen cell on its neighbors.

There are obvious temporal restrictions: if τ_1 is the internal time of the whole cell,
τ_2 is the time of connecting of cells, and τ_3 is the characteristic time of the image
changing, then

$$\tau_1 \ll \tau_2 \ll \tau_3 .$$


3.  Image  processing  by  medium  with  local  connections 

We assume that the minimal size of a picture element is of the same order as the cell size
and that the distance between cells is substantially smaller than the linear cell size.

When N = 1, the memory records only isolated elements of a size of the order of a cell,
with at least a one-cell spacing. It is not possible to record other picture elements, because
even one inhibitory connection prevents such a process. The addition of a new neighbouring
illuminated element to the one-cell spot leads to local information destruction.

When N = 2 the memory extracts both 1-cell and arbitrary 2-cell spots. If N = 3 it is also
possible to record 3-cell spots and arbitrary lines of one-cell thickness, when the
neighbouring elements are connected only by two inhibitory connections. With increasing N
the minimal size of recorded spots increases, and cells in the off state can appear inside
illuminated spots.

Large-pattern contours can be extracted when N = 5; inner cells are inhibited, whereas
contour cells (which have no illuminated neighbours outside the pattern) are not inhibited.

Analysis of pictures, maps etc. is best carried out by a step-by-step cycle, increasing N
from 1 to 8 and excluding the fragments extracted at the (N - 1)th step. In fact, MCC can be
used for decomposition of the initial picture into fragments: small spots, contours of large
spots, lines, line crossings, angles, etc.
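
The dependence of the recorded picture on the threshold N can be illustrated with a short simulation. This is only a sketch under a stated assumption, namely that an illuminated cell is blocked when at least N of its eight nearest neighbours are illuminated; the function names are hypothetical and the temporal dynamics of the real cells are ignored.

```python
def illuminated_neighbours(image, r, c):
    """Count illuminated cells among the eight nearest neighbours of (r, c)."""
    rows, cols = len(image), len(image[0])
    count = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            rr, cc = r + dr, c + dc
            if 0 <= rr < rows and 0 <= cc < cols:
                count += image[rr][cc]
    return count


def record(image, n):
    """Binary map of cells that stay recorded for threshold n.

    Assumed rule: an illuminated cell is inhibited when at least n of its
    neighbours are illuminated.
    """
    return [
        [
            1 if image[r][c] and illuminated_neighbours(image, r, c) < n else 0
            for c in range(len(image[0]))
        ]
        for r in range(len(image))
    ]


if __name__ == "__main__":
    picture = [
        [1, 0, 0, 0, 0],
        [0, 0, 0, 0, 0],
        [0, 0, 0, 1, 1],
        [0, 0, 0, 0, 0],
    ]
    print(record(picture, 1))  # only the isolated cell survives
    print(record(picture, 2))  # the 2-cell spot is recorded as well
```

Under this reading, N = 1 keeps only isolated cells and N = 2 additionally keeps two-cell spots, in line with the description above.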


4. Extraction of moving fragments

An  important  possible  application  is  extraction  of  moving  fragments.  The  initial  picture  is 
projected  onto  the  MCC  and  then  all  corresponding  E  are  turned  off  by  unit  ES.  Then  the 
initial  picture  with  shifted  elements  is  projected  again.  Obviously,  only  emitters  of  the  cells 
corresponding  to  recorded  shifted  elements  are  turned  on.  The  subsequent  projecting 
operations  make  it  possible  to  determine  the  directions  and  velocities  of  moving  elements. 
This  method  is  also  suitable  for  extraction  of  motionless  but  distinct  parts  of  patterns. 
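
In logical terms the procedure amounts to keeping the cells that are illuminated in the second projection but were not recorded from the first one. The sketch below assumes exactly this simplified rule, ignores the inhibitory connections, and uses a hypothetical function name.

```python
def moving_fragments(first, second):
    """Cells lit in the second projection but not recorded from the first.

    Simplified assumption: cells recorded from the first projection have their
    emitters disabled, so only newly illuminated cells emit.
    """
    return [
        [1 if b and not a else 0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(first, second)
    ]


if __name__ == "__main__":
    before = [[1, 0, 1, 0]]
    after = [[1, 0, 0, 1]]   # the second element has moved one cell to the right
    print(moving_fragments(before, after))
```

Repeating the comparison for successive projections gives the sequence of new positions from which directions and velocities could be estimated.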


5.  Possible  development 

The possible applications of MCC are expanded if every cell is connected not only to the
nearest cells, or not only by inhibitory connections, or if arrays other than square ones are
considered. By activating excitatory connections (with inhibitory connections switched off)
for a certain time it is possible to enlarge the size of extracted spots and the width of
lines, and to smooth indented edges.

Abstract. A system of oscillators with optical inputs and outputs transforms a
projected image into a spatial distribution of oscillator frequencies. Resonant
excitation of different groups of oscillators is discussed as a new method of
image processing.


1.  General  idea 

The main idea of the discussed approach is the following [1,2]. Suppose there is a matrix of
identical quasiharmonic noninteracting oscillators with optical inputs and outputs. The image
I(X,Y) is projected onto the matrix, which changes the oscillators' frequencies ω_0:

$$\omega_0 \Rightarrow \omega_{loc}(X,Y) = \omega_0 + \alpha I(X,Y)$$

Hence the image is encoded into the spatial distribution of frequencies.

Then homogeneous pumping is projected onto the medium through a wide-aperture modulator
operating in a harmonic modulation regime (frequency Ω). The pumping excites only those
parts of the medium that satisfy the condition ω_loc(X,Y) ≈ Ω. Parametric resonance can be
used in the same way.

Our results do not depend on the concrete model which describes the oscillators.
Hereinafter we use a Van der Pol-like model in the regime of quasiharmonic damped
oscillations. Variable K describes the amplitude of the medium response.

The parametric excitation of a one-dimensional medium with a gradient I(X) (Fig. 1,a) by
two-frequency pumping illustrates our method in Fig. 1,b. The widths of the resonant stripes
depend on Ω, as is well known in the theory of parametric resonance. By varying Ω we can
extract regions of equal intensity. If the oscillators' inputs are sensitive to the light
wavelength, it is possible to extract regions of a certain color.

An example of a problem which can be solved by the proposed method: a sheet of paper
contains a set of patterns drawn with different intensities (or colors). By projecting the
sheet onto the medium and using a set of appropriate Ω it is possible to extract every pattern.


2.  Abilities  of  resonance  method 

We have studied different modifications in which additional elements are used: passage of
the analysed pictures through a modulator, periodic shift of the picture along the medium,
and use of inhomogeneous background light pumping. Such complications allow us to realize
the following procedures: extraction of certain parts of a picture, removal of small-scale
distortions from patterns, extraction of contours and points of extrema, correlative
comparison of images, determination of common and distinct parts of images, of moving
elements, etc.

Below several possibilities are discussed in detail.

2.1. Extraction of common and distinct parts of images

Two binary pictures are projected onto the medium (onto the same place). In the common
parts we have ω_0,1 = ω_0 + 2αJ, in the distinct parts ω_0,2 = ω_0 + αJ, where J is the unit
level. By resonance excitation one can extract the common parts of the images if Ω = ω_0,1;
when Ω = ω_0,2 the distinct parts are excited.
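
The selection rule can be illustrated with an idealised model in which an oscillator responds only when its local frequency lies within a narrow band around the pumping frequency. The sketch below is not a simulation of the Van der Pol-like medium; the function names, the resonance width and the numerical values of the base frequency and the coefficient alpha are hypothetical.

```python
def superpose(first, second):
    """Pixel-wise sum of two binary pictures projected onto the same place."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(first, second)]


def resonant_mask(intensity, pump, omega0=1.0, alpha=0.1, width=0.01):
    """1 where the local frequency omega0 + alpha * I is within `width` of the pump."""
    return [
        [1 if abs(omega0 + alpha * value - pump) < width else 0 for value in row]
        for row in intensity
    ]


def common_and_distinct(first, second, omega0=1.0, alpha=0.1, unit_level=1):
    """Masks of the common parts (intensity 2J) and distinct parts (intensity J)."""
    total = superpose(first, second)
    common = resonant_mask(total, omega0 + 2 * alpha * unit_level, omega0, alpha)
    distinct = resonant_mask(total, omega0 + alpha * unit_level, omega0, alpha)
    return common, distinct


if __name__ == "__main__":
    picture_a = [[1, 1, 0]]
    picture_b = [[0, 1, 1]]
    print(common_and_distinct(picture_a, picture_b))
```

In a real medium the width of the response band would be set by the oscillator damping and the pumping amplitude rather than by a fixed tolerance.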

2.2.  Extraction  of  image  contour  and  moving  fragments 

When the image and a slightly enlarged copy of it (produced by an optical system) are
projected together and the pumping is tuned to resonance at Ω = ω_0,2, the contour can be
extracted. The method can easily be generalized to the analysis of gray-toned images and
moving fragments. In the latter case one has to project the same picture onto the medium at
a set of different moments and use resonance excitation to extract the shifted elements only.

2.3.  Halftoning 


If Ω is chosen in such a way that the medium extracts only the most intense parts (peak 1 in
Fig. 2,a), the image is encoded into a binary one (Fig. 2,b). For Ω corresponding to resonance
with peak 2, the result of binarization is shown in Fig. 2,c.

2.4. Extraction of extrema

A one-dimensional image S is projected onto the medium through a modulator and,
simultaneously, the projection oscillates in space along the medium (Fig. 3,a). The two types
of oscillation have equal frequencies and a phase shift of π/2 between them. It can be shown
that in the Van der Pol-like model the amplitude of the medium response is proportional to
dS/dx. At the extrema dS/dx = 0 and the amplitude is also equal to zero, so these points can
be extracted. The result of modeling is shown in Fig. 3,b. Two-dimensional functions I(X,Y)
can be analysed in the same way by changing the direction of the picture oscillation step by
step.
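
Why the response follows the derivative of the image can be seen from a first-order expansion. This is only a sketch, not the Van der Pol analysis itself: the oscillation amplitude a and frequency ν are symbols introduced here, and a is assumed small compared with the scale on which the image varies.

```latex
S\bigl(x + a\sin\nu t\bigr) \;\approx\; S(x) + a\,\frac{dS}{dx}\,\sin\nu t .
```

The time-varying part of the light arriving at a fixed point is therefore proportional to the local gradient of the image and vanishes at its extrema.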


3. Image processing and self-sustained oscillations (extraction of images from dynamic
noise)

Non-resonant methods can also be used for image processing. Consider unconnected oscillators in a self-oscillating regime. For simplicity we discuss the one-dimensional case. The oscillatory medium is excited by a weak binary image I(X) and by intense noise fluctuations (X,t) (Fig. 4,a). Over many periods of self-oscillation the image I(X), which locally changes the frequency ω0, gives a large accumulation of inhomogeneous local phase shifts Δφ(X). The noise gives local frequency shifts of different signs and can be fully averaged over many periods. Therefore, in the medium response the noise can produce only a constant homogeneous phase shift determined by the statistical properties of the noise. The result of numerical modeling for a certain moment (when Δφ(X) ~ π) is shown in Fig. 4,b. After the information signal I(X) is switched off, the medium becomes a memory, since the self-oscillations do not damp. Very weak signals I(X) can be extracted by this method of phase accumulation, but the extraction time becomes very large. We also modeled image extraction by a medium of self-oscillators connected by diffusion. When the image I(X) was switched on for a long time, the local phase shifts spread and a spatio-temporal oscillating halo formed around the image (Fig. 4,c). This feature can be useful for some applications, for example for enlarging small isolated picture elements. After the light is switched off, the phase distribution relaxes due to diffusion. Everything stated above is also valid for the two-dimensional case.
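The phase-accumulation argument can be sketched numerically. The following Python fragment is a deliberately simplified phase-only model, not the Van der Pol-type medium modeled in the paper: each uncoupled oscillator integrates a small frequency shift from the image plus zero-mean Gaussian noise. The parameter values (eps, sigma, steps) and the function names are assumptions chosen for illustration. In this model the signal contribution grows linearly with time and the noise contribution only as its square root, which is why a long accumulation time separates them.

```python
import random


def accumulate_phase(image, eps=0.05, sigma=1.0, steps=40000, seed=0):
    """Accumulated phase shift of independent oscillators.

    image : list of 0/1 pixels (the weak binary image I(X))
    eps   : per-step frequency shift caused by a bright pixel (assumed value)
    sigma : standard deviation of the zero-mean noise per step (assumed value)
    """
    rng = random.Random(seed)
    phase = [0.0] * len(image)
    for _ in range(steps):
        for x, pixel in enumerate(image):
            phase[x] += eps * pixel + sigma * rng.gauss(0.0, 1.0)
    return phase


def extract_image(phase, eps=0.05, steps=40000):
    """Threshold halfway between the expected shifts of dark and bright pixels."""
    threshold = 0.5 * eps * steps
    return [1 if p > threshold else 0 for p in phase]


weak_image = [0, 1, 1, 0, 0, 1, 0, 1]           # hypothetical binary image
recovered = extract_image(accumulate_phase(weak_image))
print(recovered)
```

With these assumed values the per-step signal is twenty times smaller than the noise, yet after 40000 steps the expected signal shift (2000) is ten times the noise standard deviation (200), so thresholding recovers the image.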
4.  Realization of media of oscillating elements

Media  of  oscillating  elements  can  be  realized  in  several  ways. 

a)  Matrices of microcavity lasers. There are structures in which every microcavity laser is integrated with a photocell for noncoherent light [3]. The information-carrying light flux is detected by the photocells and changes the frequency ω0 of the laser electron-photon resonance (ω0 ~10^ s^-1). Existing electro-optical modulators can excite this resonance. Matrices of microcavity lasers without photocells can also be used when the information-carrying light flux is coherent.

b)  Matrices of passive optical bistable elements in the regime of optical oscillations [4] can be used in the same way.

c)  Analog microelectronic matrices containing oscillating circuits with optical inputs and outputs are convenient too. For example, a circuit which converts light intensity to signal frequency is discussed in [5].
Abstract.  A 3-D polarization-optical stacked integration technique is developed, by which optical processors are packaged into solid-state modules by cascading building blocks of birefringent crystals and spatial light modulators. Designs and constructions of the modules of a morphological processor, a self-programming logic processor, a beam array generator, and their cascaded combination into a cellular logic image processor are demonstrated.


1.  Introduction 

Free-space optics offers the potential for building optical processors. For practical use, optical components and devices must be packaged into compact, functional solid-state modules. Optical interconnects in free space can be implemented by geometric optics [1], diffractive optics [2], and micro-optics [3]. Many approaches to compact packaging have been suggested. However, each packaging technique assembles modules for a special-purpose application. It is therefore of interest to develop a universal packaging technique that assembles processors from unique building blocks.

In this paper, we propose a new packaging technique: first, to fabricate functional building blocks of polarization-optical components for interconnection and of devices for processing and interface; then, to assemble compact solid-state processor modules by stacking different building blocks; and, further, to construct complex modules by cascading processor modules. The design of and experiments on a cellular two-layer logic image processor module are shown; the module is a cascaded integration of a morphological processor, a self-programming logic processor, and a beam array generator. The features are: (1) The crystal building blocks are simple plates. (2) The stacked modules are compact and insensitive to the environment. (3) Calcite and other crystals have high extinction and transmission ratios. (4) The interference is minimized with orthogonal polarizations. (5) The alignment of crystal blocks is not critical. (6) Cascaded stacking of processor modules is possible.




2.  Basic  building  blocks 

Calcite plates, quartz polarization rotators, waveplates, and polarizers are used as the building blocks for optical interconnection. The functions of a calcite plate for beam splitting or combination and of a quartz rotator are shown in Fig. 1. Using calcite, the maximum packaging density is about 30 spots/mm. Solid-state electro-optical SLMs with electric dual-rail addressing have been developed, as shown in Fig. 2. The other SLM used is the PROM, which can serve as either an optical interface device or a thresholding device.


Figure  1  Figure  2 


3.  Architecture  of  cellular  two-layer  logic  image  processor 

Cellular logic image processors [4] have become an important architecture for optical computing [5-8]. Based on mathematical morphology [9], we have proposed a one-operation image algebra and a cellular two-layer logic binary image processor [10]. All kinds of binary morphological processing and pattern-recognition functions can be realized simply by logic programming in an iterative mode. By using threshold decomposition and sum-superposition, various morphological gray-tone image processing functions can be achieved in a double iterative mode [11]. Fig. 3 shows the architecture. The elementary CPU is a cellular two-layer logic binary image processor, which consists of a parallel binary logic processor followed by a morphological binary dilation processor.
4.  Optical morphological processor module

The morphological processor module is shown in Fig. 4. It consists of three calcite plates with the thickness ratio l:ln/2. An input binary image is split into four images. This is equivalent to a structuring element of {0,1;1,0;0,-1;-1,0}. A thresholding level of 1 or 4 results in dilation or erosion, respectively.
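The equivalence between the four shifted copies and a thresholded neighbourhood sum can be sketched in a few lines of Python. This is an illustrative software model, not the optical implementation; the function names are hypothetical, and pixels shifted in from outside the frame are assumed dark.

```python
OFFSETS = [(0, 1), (1, 0), (0, -1), (-1, 0)]    # structuring element from the text


def shifted_sum(image):
    """Superpose four copies of a binary image, one per offset."""
    rows, cols = len(image), len(image[0])
    total = [[0] * cols for _ in range(rows)]
    for dr, dc in OFFSETS:
        for r in range(rows):
            for c in range(cols):
                rr, cc = r - dr, c - dc         # source pixel that lands on (r, c)
                if 0 <= rr < rows and 0 <= cc < cols:
                    total[r][c] += image[rr][cc]
    return total


def threshold(total, level):
    """Binary image that is 1 where at least `level` copies overlap."""
    return [[1 if v >= level else 0 for v in row] for row in total]


def dilate(image):
    return threshold(shifted_sum(image), 1)     # any copy present


def erode(image):
    return threshold(shifted_sum(image), 4)     # all four copies present


dot = [[0] * 5 for _ in range(5)]
dot[2][2] = 1                                   # single bright pixel
for row in dilate(dot):
    print(row)
```

Because this structuring element does not contain the origin, dilating a single pixel produces its four neighbours but not the pixel itself; eroding that result returns the original pixel.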


5.  Module  of  self-programming  logic  processor 

Based on dual-rail coding of multiple variables, we suggest self-programming logic. The two coded patterns are a pair consisting of an image and its negation (B1 and B2) and a pair of programming patterns (X1 and X2), each of which may be the other image, its negation, a bright background, or a dark background. The different settings of X1 and X2 lead to all sixteen logic operations. Fig. 5 shows the module. A PROM is used to code an input image into a collimated pattern with dual-rail format. Two dual-rail SLMs are used to enter X1 and X2. The two accessing beams pass sequentially through the two SLMs and are then combined by the calcite plate CP2.
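One way to see why four choices for each programming pattern give sixteen operations is the sketch below. It assumes that the rail carrying the image is gated by X1, the rail carrying its negation is gated by X2, and the combined output is the OR of the two gated rails. This multiplexer-like model is an interpretation for illustration only, and all names are hypothetical.

```python
from itertools import product

# The four programming patterns named in the text, as functions of the
# corresponding bit a of the other image.
PATTERNS = {
    "A": lambda a: a,
    "not A": lambda a: 1 - a,
    "bright": lambda a: 1,
    "dark": lambda a: 0,
}


def logic_output(b, a, x1_name, x2_name):
    """Rail B1 = b is gated by X1, rail B2 = not b by X2; the rails are then merged."""
    b1, b2 = b, 1 - b
    return (b1 * PATTERNS[x1_name](a)) | (b2 * PATTERNS[x2_name](a))


def truth_tables():
    """Truth table over (a, b) = (0,0), (0,1), (1,0), (1,1) for every X1, X2 pair."""
    tables = {}
    for x1_name, x2_name in product(PATTERNS, repeat=2):
        tables[(x1_name, x2_name)] = tuple(
            logic_output(b, a, x1_name, x2_name)
            for a, b in product((0, 1), repeat=2))
    return tables


tables = truth_tables()
print(len(set(tables.values())))                # number of distinct operations
print(tables[("A", "dark")])                    # AND of the two images
```

Under this assumption X1 fixes the output where the image bit is 1 and X2 fixes it where the bit is 0, so the sixteen pattern pairs map one-to-one onto the sixteen two-input Boolean functions.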

6.  Array  illuminator  module 

A fundamental stage consists of two identical calcite plates with orthogonal optical axes, so that an incident beam becomes 2x2 beams. Thus, N stages with multiplied calcite thickness can produce an array of 2^N x 2^N beams.
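Reading the array size as 2^N x 2^N and "multiplied thickness" as a doubling of the beam displacement from stage to stage (both are assumptions), the spot positions can be enumerated as follows. The walk-off is taken to be proportional to plate thickness, and the function name is hypothetical.

```python
def beam_array(stages, base_shift=1.0):
    """Spot positions after cascading stages whose displacement doubles each time."""
    spots = {(0.0, 0.0)}
    for n in range(stages):
        shift = base_shift * 2 ** n             # assumed: displacement doubles per stage
        spots = {(x + dx, y + dy)
                 for (x, y) in spots
                 for dx in (0.0, shift)         # one plate splits along x
                 for dy in (0.0, shift)}        # the orthogonal plate splits along y
    return sorted(spots)


print(len(beam_array(3)))                       # spot count for three stages
print(beam_array(1))                            # a single stage gives 2x2 beams
```

With doubling displacements no two beams coincide, so each stage multiplies the spot count by four and the spots fall on a regular grid.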

7.  Module  of  cellular  two-layer  logic  image  processor 

Fig. 6 gives the configuration of the solid-state system module assembled by cascaded stacking of the individual modules and beam splitters. An array of photodetectors converts the 2-D light signals to electrical signals. A PC executes the dilation thresholding and the dual-rail coding for the E-O SLM in the logic processor module.

8.  Design  and  experimental  observation 

Two series of calcite plates are fabricated: for 1 mm or 2 mm separation the thickness is 9.2 mm or 18.4 mm. The apertures are 22 mm x 22 mm. For a quartz 45° polarization rotator, the thickness is about 2.4 mm. The size of the LiNbO3 slabs is 1 mm x 16 mm x 8 mm, and the measured half-wave voltage is nearly 400 V.


Figure  6 


An  input  image  of  8x8  pixels  and  the  processed  pattern  of  edges  are  seen  as: 
Abstract  Principles of optical digital data processing based on the photon echo phenomenon in resonant media are considered. Various schemes for realizing optical processors, with a pixel structure and of holographic type, based on the digital multiplication by analog convolution (DMAC) algorithm are suggested. Methods for switching optical data flows between information channels on the basis of the photon echo phenomenon are discussed. Interconnection switching schemes using codes that are sequential in time and parallel in space are suggested.


1.  Introduction 

In recent years the photon echo (PE) phenomenon has attracted attention not only as a method for studying kinetic relaxation processes in resonant media, but also as a means of multifunctional processing of data presented as optical images formed by the excitation pulses at different moments in time. Early proposals for using PE mainly concerned analog processing methods [1-3]. These methods are characterized by very high processing speed and can be used in pattern recognition problems [4,5]. However, there is also interest in developing optical information processing methods based on the PE phenomenon for digital data processing and for the design of optical digital processors.

In this paper we analyze optical processing for the realization of vector and vector-matrix algebra operations such as inner (scalar) product calculation, vector-matrix multiplication and matrix-matrix multiplication.

The digital multiplication by analog convolution (DMAC) algorithm is usually used as the main principle of optical digital information processing [6,7]. The DMAC algorithm has been widely adopted in acousto-optical processing [8]. In accordance with this method, an arbitrary number is represented in binary form, and the result of the multiplication is obtained in mixed binary form. It is then necessary to convert the mixed binary number into the conventional binary representation. Binary words of the mixed digits are added after being shifted with respect to each other.
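The two steps of the DMAC algorithm described above (convolution of the digit sequences, then shift-and-add conversion of the mixed binary result) can be sketched in Python. This is a purely numerical illustration, not a model of the optical hardware; the function names are hypothetical.

```python
def to_bits(n):
    """Binary digits of a non-negative integer, least significant first."""
    bits = []
    while True:
        bits.append(n & 1)
        n >>= 1
        if n == 0:
            return bits


def convolve(a_bits, b_bits):
    """Discrete convolution of two digit sequences (the analog step)."""
    mixed = [0] * (len(a_bits) + len(b_bits) - 1)
    for i, a in enumerate(a_bits):
        for j, b in enumerate(b_bits):
            mixed[i + j] += a * b               # digits may exceed 1: mixed binary
    return mixed


def mixed_to_int(mixed):
    """Shift each mixed digit to its binary weight and add."""
    return sum(digit << k for k, digit in enumerate(mixed))


def dmac_multiply(x, y):
    return mixed_to_int(convolve(to_bits(x), to_bits(y)))


print(convolve(to_bits(5), to_bits(7)))         # mixed binary digits of 5 x 7
print(dmac_multiply(5, 7))
```

For 5 x 7 the convolution gives the mixed binary digits [1, 1, 2, 1, 1] (least significant first), which the shift-and-add step converts to 35.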
2.  Photon echo based digital processor for vector-matrix multiplication

Due to their unique dynamical properties in space and in time, the coherent responses of PE can be successfully applied in data processing systems. To utilize the DMAC algorithm with the PE phenomenon, binary encoding must be performed in time or in space. We consider two types of optical schemes that use this algorithm, in the time domain and in the space domain [9].

2.1  PE pixeled vector-matrix multiplier with spatial integration

In the first case the multiplied numbers are represented in binary form as a temporally sequential code, and the processing takes place in independent spatial regions (pixels) (Fig. 1). In each pixel of the resonant medium the convolution or correlation function of two pulses in time is calculated. On the whole, during the action of the exciting light pulses the spatial-temporal light modulator (SLM) has to form different images, the number of which is equal to the number of significant digits of the representation used. To work in this way, the SLM should have electrically independent addressing of each pixel. The addition of the different components is accomplished by a cylindrical lens.

2.2  PE  holographic  vector-matrix  multiplier  with  spatial  integration 

In the second case (the so-called scheme of holographic type) all numbers are introduced in a parallel code and the DMAC algorithm is realized in space. A compound optical scheme performs a one-dimensional Fourier transformation along one coordinate, while along the other coordinate it forms an optical image and compensates the phase distortion (Fig. 2). In this case the resonant medium is the spectral plane of a multichannel dynamic hologram with one-dimensional filtering of spatial frequencies. As a result, the products of the different vector components A_i B_i are formed in the output plane in the mixed binary representation. The cylindrical lens performs the addition, and a signal proportional to the scalar product of the two vectors A and B arises on the linear array of photodetectors.


Figure 1.  PE pixeled vector-matrix multiplier with spatial integration. L - laser, L1 and L2 - lenses to form a plane wave front, SLM - spatial light modulator, RM - resonance medium, CL - cylindrical lens for spatial integration, 1D APD - 1D array of photodetectors, EB - electronic block for data processing, CB - control block.
Figure 2.  A scheme of the PE holographic vector-matrix multiplier. L1 and L2 - collimating lenses to form a plane wave front, SLM - spatial-temporal light modulator, (CL - SL)1,2 - combinations of spherical and cylindrical lenses to form the optical image over Y and the Fourier transform over X, CL 1,2 - cylindrical lenses for phase compensation, RM - resonance medium, CL - cylindrical lens for spatial integration, 1D APD - 1D array of photodetectors combined with a data processing electronic block.

If, during the action of one of the incident laser pulses, the SLM generates a set of different frames corresponding to the specification of vector and matrix data, then the output data stream from the resonance medium provides a series of signals in time.

Thus, the digital data processing principle is based on filtering of temporal frequencies in the resonant medium in the processor with the pixel structure, and on multichannel one-dimensional filtering of spatial frequencies in the processor of holographic type.


3.  Interconnection  data  flow  switching  based  on  the  PE  phenomenon 

The development of optical interconnections is a key problem in implementing information networks with high throughput. In the particular case of digital data processing, data flow switching may act as controlled data transmission, that is, as information channel switching.

Here we offer some principles based on the use of the PE phenomenon. Data flow switching can be easily implemented on the principle of vector-matrix multiplication. In Section 2 we considered optical processing for vector-matrix operations. In particular, two schemes for the binary representation of input numbers, using codes sequential in time and parallel in space, were proposed.

In the first scheme the vector components are set by specifying images in which the numbers of rows and columns are identical and equal to the vector dimension, and the addition operation is accomplished by the cylindrical lens.

In accordance with the principle of vector-matrix interconnection switching, when matrix A is defined with a single nonzero element, for example A_ik = 1 for the given i and k and the other elements A_ik = 0, controlled switching of the data flow between channels i and k occurs (the vector components play the role of the information), and the input data of channel number k arrive at channel i on the output. In this case each vector B of an image row is introduced in a sequential code, and thus each component of vector B corresponds to the definition of information channel k. As a result, at the system input the number of information messages is equal to the number of components (the dimension) of vector B, and the number of the corresponding column of the image of identical vectors coincides with the channel number. The photodetector sequence at the system output forms the set of output channels, each of which receives a single message from one of the input channels (k = 1, 2, ..., N).
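Routing by vector-matrix multiplication can be illustrated with a short Python sketch. It assumes a control matrix with exactly one nonzero element per output row, so that each output channel receives one input channel; the helper names and the example route are hypothetical.

```python
def routing_matrix(route, n_inputs):
    """Control matrix A with A[i][k] = 1 when input k is routed to output i."""
    return [[1 if k == route[i] else 0 for k in range(n_inputs)]
            for i in range(len(route))]


def switch(control, inputs):
    """Output channel i receives the sum over k of A[i][k] * B[k]."""
    return [sum(a * b for a, b in zip(row, inputs)) for row in control]


A = routing_matrix([2, 0, 1], n_inputs=3)       # output 0 <- input 2, and so on
B = [10, 20, 30]                                # data on input channels 0, 1, 2
print(switch(A, B))
```

Here the summation over k plays the role that the text assigns to the cylindrical lens: since all but one term in each row is zero, the sum simply passes the selected input to the output.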

In the scheme of holographic type each image formed by the spatial-temporal light modulator represents a vector: the number of components along one coordinate is equal to the dimension of the defined vector, and the number of binary digits lies along the other coordinate. Thus, since matrix A is defined during the action of one of the light pulses in the form of M images (which corresponds to the definition of M vectors, or a data matrix), that is, as a sequence of different vectors in time, the channel numbering at the input must correspond to the order of this sequence in time.

We shall call the sequence normally ordered when a channel with a larger number is introduced with a longer delay in time. Therefore, to switch the input channels it is necessary to define the control matrix in the form of an image sequence with a single transmission window.

As a result, the components B_k of vector B will arise on the linear photodetector array in a parallel code, sequentially in time. This sequence in time will, in general, no longer be normally ordered, and the numbers of the output information messages will correspond to the positions of the transmission windows of matrix A.

Thus, whereas in the first case (the optical scheme with the pixel structure and temporal data coding) the channel switching occurs in space, in the second case (the optical scheme of holographic type with spatial data coding) it occurs in time.


4.  Conclusion 

Thus, the possibilities of using the PE phenomenon in optical information processing systems may be substantially extended by means of digital data processing. It will also be useful for combining optical parallel processing with accuracy and high speed.
Abstract

High  capacity  (hundreds  of  channels  with  >100  Mb/s  per  channel)  switching  systems  frequently 
encounter  difficulties  in  interconnect  packaging  within  the  switching  fabric.  These  interconnection 
problems  may  be  alleviated  through  the  use  of  surface  normal  optical  interconnections  using 
optoelectronic  smart  pixels.  Two  recent  experiments  with  FET-SEED  smart  pixels  culminated  in 
the  operation  of  a  5  stage  free-space  switching  network  at  a  155  Mb/s  channel  rate.  The  network 
used  embedded  control  techniques  for  network  control  and  the  smart  pixels  consisted  of  GaAs  FET 
logic  with  MQW  SEED  detectors  and  modulators.  The  system  also  incorporated  external  cavity 
lasers,  bulk,  micro,  and  diffractive  optics,  two-dimensional  fiber  bundles,  and  novel 
optomechanics.  At  155  Mb/s,  77  of  the  80  total  pixels  in  the  system  and  31  of  the  32  input 
fibers  were  functional.  Two  of  the  network  paths  have  carried  digital  video  at  105  Mb/s  for  over 
6  months  without  readjustment.  Error  rate  measurements  on  these  paths  have  shown  a  short  term 
BER of 10^-10.

1.  Introduction 

Smart  pixel  optoelectronic  device  arrays  represent  a  potential  solution  to  the 
interconnection  "bottleneck"  problem  present  in  many  high  performance  digital  systems.  By 
utilizing  both  high  speed  electronics  and  surface-normal  optoelectronics  in  2-dimensional 
arrays,  smart  pixels  enable  the  interconnection  and  communication  advantages  of  optics  to 
complement  the  processing  power  of  electronics  in  computing  and  switching  applications.  A 
recent  system  prototype  incorporated  GaAs  multiquantum  well  (MQW)  FET-SEED  smart 
pixels to implement the multi-stage switching network shown in Fig. 1 [1]. This 32-input, 16-output multi-stage switching fabric used 5 stages of 4x4 smart pixel (2,1,1) node arrays [2]. In
addition  to  the  smart  pixels,  the  system  used  computer  generated  holograms  realized  as  binary 
and  multi-level  phase  gratings  (B/MPG)  [3,4],  2-D  fiber  bundles  [5],  external  cavity 
semiconductor  lasers  [6],  high  resolution  bulk  [7]  and  microoptics  [8],  and  custom  optical 
bench  optomechanics.  [9]  Network  control  and  call-load  generation  were  done  by  a  personal 
computer,  custom  high-speed  electronics,  and  a  programmable  multichannel  digital  pattern 
generator.  Results  of  the  initial  experiment  included  operation  of  the  5  stage  network  at  50 
Mb/s  per  channel  with  15  of  the  32  inputs  populated.  In  this  paper,  we  describe  enhancements 
to  the  initial  system  experiment  that  have  allowed  us  to  exercise  the  entire  fabric  (32  inputs), 
and  increase  the  fabric’s  bit  rate  from  50  Mb/s  to  155  Mb/s  per  channel. 

2.  Modifications to the original System5 experiment

Several  system  modifications  have  been  made  for  this  experiment,  as  shown  in  Table  1. 




Figure 1.  5-stage Banyan interconnection network using embedded control and 4x4 switching node arrays (32 fiber inputs; fiber outputs).


The use of larger modulator windows (10x10 µm²) reduced clock/power supply beam clipping. By removing the binary phase grating (BPG) that implemented the inter-node connection in our earlier system [1], the signal power per receiver was increased by a factor of approximately 3.7. This enabled higher bit rate operation and improved the overall system stability. A new input fiber bundle, phase gratings, and diagnostic/analysis software have been added to the system.

Other changes have been made to the smart pixel circuit design [2]. The node circuit of Fig. 2 consists of an optical receiver, a control memory to store a routing bit extracted from the data stream, and a multiplexer/modulator driver. The extracted control bit determines whether each node's multiplexer selects its own optical receiver's output or the output of another node's optical receiver (connected via inter-node metallizations). Design and processing modifications have increased FET currents, although the smart pixel array's performance was partially limited by lower than expected threshold voltages.

An important system issue is the lack of spatial uniformity of FET and pixel performance within each array and between arrays. This non-uniformity can lead to a dramatic difference between the operational bit rate (or the required optical energy for a given bit rate) of a single device and that of an array (or multiple arrays) of devices. The non-uniformity is mainly due to variations in the receiver inverter threshold voltages across the array. In the initial experiment [1], these variations were mainly due to level-shifting and clamping diode voltage drop variations, bias voltage variations from voltage drops along the power supply metallization, and FET threshold variations. Optimally matching a single pixel's characteristics requires the optimization of bias voltages, incident wavelength, optical power levels, etc. In a practical system, these system parameters must be fixed at some "average" value, and if pixel uniformity is lacking, most of the pixels will operate sub-optimally. Individual smart pixels of the arrays used in the first experiment operated at 50 Mb/s with only 40 fJ of incident differential optical energy, but simultaneous operation of an entire array of pixels at 50 Mb/s required about 300 fJ of differential optical energy. In the second generation of these smart pixels, the MQW level-shifting diodes were replaced by Schottky barrier diodes, and thicker metallization lowered the voltage drop along the power supply traces. Individual node circuits on these new arrays operated at 155 Mb/s with 94 fJ of incident differential optical signal energy without system parameter (e.g., bias voltage) optimization. In contrast to the first generation devices, only small (< +/-135 mV) variations of the receiver inverter threshold voltages across the arrays were noted [1]. This significantly eased the system parameter tolerances and eliminated the repetitive parameter adjustments required in the previous experiment.
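To put the quoted switching energies in perspective, they can be converted to an average optical power per receiver. The sketch below assumes that the quoted energy is delivered once per bit period, so that average power is simply energy per bit times bit rate; this is a rough estimate for illustration, not a figure from the experiment, and the function name is hypothetical.

```python
def average_power_uW(energy_fJ, bit_rate_Mbps):
    """Average optical power in microwatts for a given energy per bit."""
    watts = (energy_fJ * 1e-15) * (bit_rate_Mbps * 1e6)
    return watts * 1e6


cases = [
    ("single pixel, first generation", 40, 50),
    ("whole array, first generation", 300, 50),
    ("single node, second generation", 94, 155),
]
for label, energy_fJ, rate_Mbps in cases:
    print(f"{label}: {average_power_uW(energy_fJ, rate_Mbps):.2f} uW")
```

Under this assumption the first-generation array needed roughly 7.5 times the per-receiver power of its best single pixel, which is the uniformity penalty the paragraph describes.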
Table 1.  Component and system comparison for the two System5 experiments.


Figure  2.  The  (2,1,1)  node  schematic  for  the  nodes  used  in  the  second  experiment. 


Figure 3.  Optical schematic of the prototype free-space switching network, with arrows indicating the beam propagation direction. Stage labels visible in the figure: FET-SEED1, FET-SEED2, FET-SEED5.


3.  System  hardware  design 

The system's infinite conjugate optical design is shown schematically in Fig. 3. The modulators were supplied with approximately 20 mW of average optical clock power at 850 +/- 0.2 nm from a collimated, frequency-stabilized external cavity laser. The laser's light output had a 30% duty cycle at 155 MHz. The modulation of the laser's output was somewhat triangular, due to high frequency roll-off in its drive circuitry. The use of non-separable design BPGs [10] [4] for spot array generation provided increased light efficiency. The BPGs had a power uniformity of 93% and a measured efficiency (neglecting Fresnel losses) of 73%, compared to 60% in the initial experiment. The use of 1300 nm SM fibers in the input fiber bundle (versus the 850 nm SM fibers used in the first experiment [1]) eased the alignment of the data input lasers, and active alignment of the fibers during the bundle assembly ensured fiber positioning errors of less than 2 µm. System inputs are generated by 810 nm lasers attached to the input fibers, which are arranged in an 8x4 bundle on a 500 µm pitch. This fiber matrix is imaged onto the first node array through a dichroic beam combination system similar to the PBS assemblies of the other stages, except that PBS1 has λ/4 retarders and dichroic mirrors (transmit 850 nm, reflect 810 nm) attached so that the unpolarized light from the input fibers is directed onto the first node array regardless of its polarization. This obviated the need for polarization maintaining fiber in the input fiber bundle. To limit the total number of fibers, input lasers, and laser drivers required, the fiber bundle provides single-ended inputs to the 2 optical receivers of each node in the first array. The first laser (also at 810 nm) provides a reference level to convert the single-ended inputs to differential format. The beam from this laser is split into an array of beams by the BPG, and these beams are combined with the fiber input signals at a 50%:50% beam splitter. The combined arrays of beams propagate through the dichroic combiner and are focused onto the first FET-SEED array. Due to the 120 µm pitch of the first stage's optical receivers, a 0.24x demagnification between the fiber and first node arrays is necessary to match pitches. This introduces clipping at the objective lens (since the beam numerical apertures at the fiber and FET-SEED planes are approximately equal) and results in a 10 dB power loss. The use of 1300 nm SM fibers improved the efficiency and uniformity of the laser coupling to the fibers, but since they support 4 modes at 810 nm, a -2.2 dB modal noise optical power penalty was incurred. Both of these issues may be reduced by the use of a microlens array in a hybrid imaging system, as in Ref. 1. Fabrication errors in the alignment of the input beam splitter assembly's dichroic mirrors have been corrected by the addition of a birefringent wedge arrangement in the optical path before the assembly. The MM output fiber bundle used 100 µm core-diameter fibers on a 400 µm pitch.
Figure 4.  Oscillographs for the network connections between inputs 0 through 15 (rows) and outputs 0 through 7 (columns), for system operation at 155 Mb/s. Each plot is composed of 2 overlaid oscillographs, corresponding to the 2 inputs to the given first-stage node.

4.  System  testing  and  results 

An  automated  call-load  generation  program  was  developed  for  this  system  to  identify  the 
effects  of  faulty  nodes  on  the  network.  This  system  automat^  the  sequential  testing  of  all  512 
paths  through  the  5-stage  network  at  various  channel  rates,  by  configuring  the  network  and 
storing  oscillographs  of  each  channel's  output  in  computer  memory.  Figure  4  shows  one 
quarter  of  the  oscillographs  recorded  by  this  test  system  during  a  system  exercise  at  155  Mb/s. 
Subsequent  data  reduction  analyzed  the  faulty  paths  to  deduce  the  faulty  nodes  within  the 
network.  At  155  Mb/s,  there  were  3  non-functional  pixels  (2  in  stage  1  and  1  in  stage  3)  and  1 
low-powered  input  fiber.  Due  to  the  embedded  control  structure  of  the  network,  a  faulty  node 
(or  a  low  powered  input  fiber)  in  one  stage  will  prevent  control  information  from  reaching  later 
stages,  thus  lessening  the  number  of  available  paths.  Thus,  the  3  bad  nodes  and  1  bad  fiber 
limited  the  network  path  availability  to  45%.  The  system  maintained  this  functionality  within 
our  lab  (+/-  1  degree  C,  Newport  table)  for  over  4  weeks.  Two  inputs  were  equipped  with 
video  CODECs  operating  at  105  Mb/s  and  these  2  paths  have  remained  operating  for  four 
months  without  adjustment.  The  short  term  BER  was  10"^^  at  105  Mb/s  for  one  of  these 
video  paths,  and  its  eye-diagram  is  shown  in  Fig.  5.  Although  some  channels  have  higher 
BERs, it is possible that further optimization of FET-SEED bias voltage, clock beam power,
and  operating  wavelength  can  lower  the  BERs.  The  long  term  BERs  on  these  channels  were 
higher  (10"^),  due  to  amplitude  instability  of  the  stage  5  clock  laser. 
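
The data-reduction step described above (deducing faulty nodes from the set of faulty paths) can be illustrated with a minimal sketch. The node labelling, the toy two-stage topology and the function names below are assumptions made for illustration only; they are not the authors' test program. The rule used is simple elimination: any node that lies on at least one working path is presumed good, and the remaining nodes on failed paths are suspects.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[int, int]  # (stage number, node index within the stage); hypothetical labelling


def suspect_nodes(paths: Dict[str, List[Node]], passed: Dict[str, bool]) -> Set[Node]:
    """Return nodes that appear only on failed paths (elimination rule)."""
    known_good: Set[Node] = set()
    for name, nodes in paths.items():
        if passed[name]:
            known_good.update(nodes)
    suspects: Set[Node] = set()
    for name, nodes in paths.items():
        if not passed[name]:
            suspects.update(n for n in nodes if n not in known_good)
    return suspects


def path_availability(passed: Dict[str, bool]) -> float:
    """Fraction of tested paths that worked."""
    return sum(passed.values()) / len(passed)


if __name__ == "__main__":
    # Toy 2-stage network with two nodes per stage (not the 5-stage system).
    paths = {
        "p0": [(1, 0), (2, 0)],
        "p1": [(1, 0), (2, 1)],
        "p2": [(1, 1), (2, 0)],
        "p3": [(1, 1), (2, 1)],
    }
    passed = {"p0": True, "p1": True, "p2": False, "p3": False}
    print(suspect_nodes(paths, passed), path_availability(passed))
```

In this toy case both failed paths share one first-stage node, so that node is the only suspect and half of the paths remain available.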

The  development  of  this  prototype  would  not  have  been  possible  without  the  combined  efforts 
of  the  authors  listed  in  references  1, 2  and  8. 
Figure 5. Eye diagram of the detected signal output (from Stage 5) of one interconnection path operating at 105 Mb/s.
3-D free-space transmission polarization-based high-throughput photonic switching systems: architectural issues


V  B  Fyodorov 

Scientific  Computer  Centre,  Russian  Academy  of  Sciences,  Leninsky  pr.  32  A,  Moscow, 
117334,  Russia 

Abstract. It is demonstrated how various types of optical polarization-based link blocks and
polarization switch arrays make possible an effective architecture for $2N^2 \times 2N^2$ three-dimensional
compact high-performance optoelectronic nonblocking multistage interconnection
networks for switching images. The main characteristics of such networks are evaluated.


1.  Introduction 

Multistage interconnection networks (MINs) containing $O(N \log_2 2N)$ binary switching
elements are better suited to networks of a large size. However, the practical implementation of
polarization-based MINs, which transfer 2-D images through a connected pair of optical
channels, encounters several problems. One of them is to create an optical
system that ensures equal optical path lengths for both p- and s-polarized light components in
any input-output interconnection pattern and has a sufficient numerical aperture to transfer
images consisting of a large number of elements. The impact of the optical leakage of the
signal into its orthogonally polarized component upon the quality of the transmitted picture is
poorly known. Unlike the loss from photon absorption, such an optical leakage loss results in
a fundamental limitation on the parameters of optoelectronic MINs.
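
To make the scaling argument concrete, the short sketch below counts binary switching elements for a $2N \times 2N$ network built from $\log_2 2N$ stages of $N$ two-by-two switches, and compares it with the $(2N)^2$ crosspoints of a crossbar. The constant factor of one and the crossbar comparison are assumptions added for illustration; the text only states the order $O(N \log_2 2N)$.

```python
def min_switch_count(n: int) -> int:
    """Binary switches in a log-stage MIN of size 2N x 2N.

    Assumes N switches per stage and log2(2N) stages (constant factor 1).
    """
    ports = 2 * n
    if n < 1 or ports & (ports - 1):
        raise ValueError("2N must be a power of two")
    stages = ports.bit_length() - 1  # exact log2 for a power of two
    return n * stages


def crossbar_crosspoints(n: int) -> int:
    """Crosspoints of a full crossbar with 2N inputs and 2N outputs."""
    return (2 * n) ** 2


if __name__ == "__main__":
    for n in (4, 64, 256):
        print(2 * n, min_switch_count(n), crossbar_crosspoints(n))
```

For a 512 x 512 network (N = 256) this gives 2304 binary switches under the stated assumption, against 262144 crosspoints for the crossbar.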

In this paper several versions of compact optoelectronic MINs for image
switching between $2N^2$ input and $2N^2$ output channels are suggested. The switching stages of
the MIN use arrays of light polarization plane modulators (LPM). The regular optical
interconnections of link stages are implemented by using polarization-sensitive elements and
free-space optics. The influence of crosstalk, due to light leakage of the signal into its orthogonal
component, upon the contrast ratio and the intensity of the output image is analysed. The main
parameters of such MINs are estimated.

2.  Polarization-based  optical  systems  for  link  stages 

The three types of polarization-based building blocks with M inputs and M outputs that
can be exploited for designing link stages are depicted in Fig. 1. Polarizing beam
splitter (PBS) cubes and PBS prisms are the main components of these blocks. According to
experimental and theoretical investigations, at certain thicknesses of the dielectric layers of the
polarizing coating the angular aperture of the PBS cube can be as large as ±10° for p- and
s-polarization components with very good transmittance, beam splitting ratio, and
depolarisation factor [1]. Theoretically [2], because of nonplanar wavefronts, the depolarisation
Fig. 1. Three types of optical link blocks, implemented on the basis of a PBS prism (A), a PBS cube (B) and four PBS
cubes (C). PBS - polarizing beam splitter, HWP and QWP - half- and quarter-wave plates, MIR - mirror.


factor for the angle of ray incidence $\theta < 10^\circ$ is less than $\approx 0.25\sin^4\theta \approx 2\times10^{-4}$. This value is
comparable with the ultimate optical leakage into the orthogonal polarization for an ideal LPM at
the same angle of ray incidence $\theta$, which equals $\gamma_m = \sin^2(\pi\sin^2\theta / 4n_m^2)$, where $n_m = 1.5$ is some
average of the ordinary and extraordinary refractive indices of the LPM [2]. If the angle $\alpha$ of the PBS
prism with refractive index $n$ satisfies the relation $\cos\alpha = 2^{-0.5}(1 + 1/8n^2)^{0.5} + 1/4n \approx 2^{-0.5} + 1/4n$,
inputs and outputs with the same name are placed on the horizontal optical axis
that passes through them. For $n = 1.73$ the angle $\alpha = 30^\circ$. In this case the angle $90^\circ - \alpha$ equals the
Brewster angle. In the link setup C the half-wave plates (HWP) depicted by dashed lines can
be absent if the output PBSs are turned by 90° about the horizontal axis [3, 4]. The setup of
each  building  block  guarantees  equal  optical  path  lengths  of  communication  links  between 
input  and  output  channels  and  allows  one  to  apply  optical  means  with  the  ultimate  numerical 
aperture  Wa  =  nsinal{\^cosa)M,  wb  =  and  wc  =  nlAVl  for  blocks  A,  B,  and  C, 
respectively.  The  greatest  numerical  aperture  of  optical  channels  can  be  for  building  block  C. 
The  numerical  aperture  Wa  for  /i=1.5-1.8  is  about  1.3  time  less  than  that  of  block  uq.  The 
pattern  connections  for  these  link  stages  (see  Fig.  2)  are  topologically  equivalent  to  other 
more  commonly  used  ones  such  as  Banyan  and  perfect  shuffle  connections. 

3.  Architectural  issues  for  multistage  image  interconnection  networks 


To design a nonblocking image MIN of size 2N×2N with pattern connections ensured by
building blocks of types A, B, and C it is necessary to use blocks with M = N, N/2, N/4, ...,
2. The numerical aperture of such a MIN will be limited by the value u of the block with the
maximal value M = N. As an example, the implementation of a 2-D optoelectronic MIN of
size 8×8, in which blocks of type B are used, is presented in Fig. 3. The optical path lengths
of connections in such a MIN are equalised by the use of compensators shown as grey
triangles. The size of the compensator may be chosen from the condition $l_c = l_p/(n_c - 1)$,
Fig. 2. Interconnection patterns for p- and s-polarized
beams passing through the link block A or B (a) and link
block C (b).


Fig.  3.  A  setup  of  8x8  optoelectronic  nonblocking 
multistage  interconnection  network  with  a  link  stage 
of the type B.



Fig.  4.  Implementation  of  32x32  polarization-based  free-space  image  interconnection  network 
with  link  blocks  of  the  type  A  (  a)  and  type  C  ( b). 

where $l_c$ and $l_p$ are the lengths of the contacting sides of the compensator and the PBS cube,
respectively, and $n_c$ is the refractive index of the compensator. If the compensator is made of
glass with $n_c = 1.8$ (heavy flint) then $l_c = 1.25\,l_p$. The mirrors alter the interconnection pattern
realized by block B. However, it remains topologically equivalent to the crossover network.

A 3-D compact optoelectronic image MIN of size $2N^2 \times 2N^2$ is obtained with square LPM
arrays when $N = 2^k$, $k = 1, 2, 3, \ldots$. Examples of such nonblocking MINs of size
32×32 (N = 4, K = 5) are shown in Fig. 4. The image MIN with blocks B provides the
greatest numerical aperture of optical channels, but the mutual arrangement of horizontal and
vertical interconnection units in such a structure represents a rather difficult design task.
Furthermore, the optoelectronic image MIN with blocks B does not allow one to use square
arrays of lenses and LPMs. The optoelectronic image MIN with blocks C is the most
compact. If the size of the cross-section of all stages is the same, the length of each link stage
in this MIN is approximately 2 or 1.25 times less than that required when using blocks A or B.
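
The relation between the LPM array size and the network size can be checked with a small sketch. It assumes square LPM arrays with $N = 2^k$ elements per side, $2N^2$ channels, and a stage count $K = \log_2(2N^2) = 2k + 1$; the last formula is an assumption that merely agrees with the quoted example (N = 4, K = 5), and the function name is hypothetical.

```python
from typing import Tuple


def image_min_parameters(k: int) -> Tuple[int, int, int]:
    """Return (N, number of channels, number of stages K) for N = 2**k.

    Assumes 2*N*N channels and K = log2(2*N*N) switching stages.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    n = 2 ** k
    channels = 2 * n * n
    stages = channels.bit_length() - 1  # exact log2, since channels is a power of two
    return n, channels, stages


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        print(image_min_parameters(k))
```

With k = 2 this reproduces the 32 x 32, five-stage example, and k = 3 gives the 128 x 128 network discussed later in the paper.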

4.  Performances  of  the  optoelectronic  image  interconnection  networks 

The image contrast at the output of a MIN channel depends on both the connection pattern
and the kind of picture transmitted through other channels. An estimate of the image
contrast ratio (signal-to-noise ratio) in the output channels of a K-stage Banyan MIN for the
worst and the best cases can be made under the following assumptions. First, all light signals
that form images are incoherent, and therefore there is no interference between signals of the
same polarization. Second, the decrease in the intensity of a data light signal is the result of
the leakage of a part of the light signal into the orthogonal component, provided that the total
intensity is not changed. The total leakage loss $\gamma$ in the LPM and polarization-based elements
of the link stage is the same for both the ON and OFF states of the LPM ($\gamma = \gamma_{on} = \gamma_{off}$) and does
not depend on polarization. Other factors which lead to a change in the intensity are
neglected. Third, all light signals that form the input image are of the same intensity. The
input intensity $A_0$ of the p-polarized signal equals the input intensity $B_0$ of the s-polarized signal.

If these conditions are fulfilled, it can be shown that for the worst case the intensity
$A_{K,s}$ corresponding to the logical level '1' of the signal and the image contrast $\Gamma_{out}$ at the
output of the channel are equal to

$A_{K,s} = A_0 (1-\gamma)^K [1 + (\Gamma_{min}\Gamma_{in})^{-1}]$  (1)

$\Gamma_{out} = \Gamma_{min} [1 + (\Gamma_{min}\Gamma_{in})^{-1}] / [1 + \Gamma_{min}\Gamma_{in}^{-1}]$  (2)

$\Gamma_{min} = A_{K,s}/A_{K,n} = (1-\gamma)^K / [1 - (1-\gamma)^K]$  (3)

where $\Gamma_{min}$ is the output image contrast when the contrast ratio of the input image $\Gamma_{in} \gg 1/\gamma$ and
$A_{K,n}$ is the intensity of the crosstalk (noise) corresponding to the logical level '0' of the signal.

For the case of the best contrast ratio the change of levels of the dual-rail signal,
associated with the partial signal leakage into the orthogonally polarized component, is
compensated by the leakage of signals transmitted by this orthogonal component. Thus, the
input image goes through the MIN without any change of the intensities $A_{K,s}$ and $A_{K,n}$, or
of the contrast ratio, that is, $\Gamma_{out} = \Gamma_{in}$.

Therefore, we can conclude that in a K-stage Banyan interconnection network
the optical signal leakage leads to a spread in the output image intensity $A_{K,s}$ of the data
signal from the input image value $A_0$ to the value $A_{K,s} < 0.5 A_0$, where $A_{K,s}$ is
determined by (1), and to a spread in the output image contrast from $\Gamma_{out} = \Gamma_{in}$ to the value
determined by (3). As follows from (2) and (3), with relatively high requirements on the
output image contrast ratio, $\Gamma_{out} > 10$, the creation of a Banyan MIN of a large size (for
example 512 × 512) is possible only with polarization-based elements and
LPMs which have a small total leakage $\gamma < 0.01$; the overall light leakage $\gamma$ through the
LPM and polarization-based elements of a link stage must be less than ~0.04-0.025 to
ensure an image contrast ratio $\Gamma_{out} = 3$-$5$ in a MIN of size 128×128, which is feasible in
practice.
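
The leakage figures quoted above can be cross-checked numerically. The sketch below assumes that the worst-case contrast of equation (3) reads $\Gamma_{min} = (1-\gamma)^K / [1 - (1-\gamma)^K]$ and that equation (2) reads $\Gamma_{out} = \Gamma_{min}[1 + (\Gamma_{min}\Gamma_{in})^{-1}] / [1 + \Gamma_{min}/\Gamma_{in}]$; both forms are reconstructions from a damaged source and should be treated as assumptions, as should the stage counts K = 7 (128 x 128) and K = 9 (512 x 512).

```python
import math


def worst_case_contrast(gamma: float, stages: int, contrast_in: float = math.inf) -> float:
    """Worst-case output contrast of a K-stage network with per-stage leakage gamma.

    Uses the assumed forms of equations (2) and (3); an infinite input contrast
    returns Gamma_min itself.
    """
    if not 0.0 < gamma < 1.0:
        raise ValueError("gamma must lie strictly between 0 and 1")
    surviving = (1.0 - gamma) ** stages          # fraction left in the signal rail
    gamma_min = surviving / (1.0 - surviving)     # equation (3), as assumed
    if math.isinf(contrast_in):
        return gamma_min
    # equation (2), as assumed
    return gamma_min * (1.0 + 1.0 / (gamma_min * contrast_in)) / (1.0 + gamma_min / contrast_in)


if __name__ == "__main__":
    for gamma, stages in ((0.04, 7), (0.025, 7), (0.01, 9)):
        print(gamma, stages, round(worst_case_contrast(gamma, stages), 2))
```

Under these assumptions a leakage of 0.04 to 0.025 over seven stages gives a contrast of roughly 3 to 5, and a leakage of 0.01 over nine stages gives a contrast slightly above 10, in line with the values stated in the text.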

The  maximal  pixel  density  p^ax  can  be  evaluated  by  the  following  expression;  p^ax  = 
{lucl^Xf  =  {n  f6Xf  According  to  this  expression  and  taking  into  account  inevitable 
aberrations of the optical system and the requirement of its simple alignment, as well as the opportunities
for realizing laser and photodetector arrays, it can be considered that for a practical design
with $\lambda \approx 0.9$ µm and $n = 1.5$ it is possible to form, transfer and register images with a
maximal number of pixels $m \times m \approx 10^4$-$10^5$ on a centre-to-centre spacing of about 50-100 µm for
an array size of $L \times L \approx 1$ cm².

In  the  case  of  a  laser  array  with  the  total  light  power  P  =  Qr[~l  W  (ri  ==  0.1)  and  =1 
cm^  under  heat  release  in  the  laser  array  Q/=10  W  and  in  photodetector  array  Q,  =1  W  and 
the  threshold  sensitivity  of  photodetectors  ~  10  fl  [5]  the  rate  of  data  transfer  V=:QL^/E  can 
reach  a  value  of  V  0.1  Pbit  s-f  To  maintain  such  a  rate  the  image-repetition  frequency  must 
be  F  =1-10  GHz  for  the  number  of  pixels  mxm  ^ As  this  takes  place,  the  peak  total 
data capacity $W = NV \approx 10$ Pbit s$^{-1}$. The reconfiguration time $t_r$ is the total time needed
to change completely the state of all LPMs to establish a new interconnection pattern.
When ferroelectric smectic liquid crystal LPM arrays are used the time $t_r$ is equal, at best, to
a few microseconds. An idea of the overall dimensions of the optoelectronic image
MIN can be obtained from the construction sketched in Fig. 4b. The length of a MIN with 128 inputs and
128 outputs is about 20 cm for a cross-section of 2.5×2.5 cm.

5.  Conclusion 

Our study has shown that it is now possible to create compact bulk high-performance
optoelectronic interconnection networks for transmitting data through any pair of connected
optical channels as images or parallel binary codes. These interconnection networks use
arrays of light polarization plane modulators and link stages implemented by classical optical
techniques. Such interconnection networks may exceed electronic ones in total data
throughput and density of data flow (bit cm$^{-2}$ s$^{-1}$) by a few orders of magnitude.
Abstract. The architecture of a "universal" photonic backplane is described. The
architecture  contains  a  large  array  of  "programmable"  smart  pixels  which  can  be 
configured  into  a  few  basic  states.  By  setting  pixel  states  appropriately  any  network 
including  crossbars,  hypercubes,  meshes,  and  hypermeshes  can  be  embedded  into  the 
backplane. 

1.  Introduction. 


A  photonic  backplane  consists  of  a  large  number  of  parallel  optical  channels,  typically 
1,000  to  10,000,  spaced  a  few  hundred  microns  apart  [1][2].  Each  optical  channel  operates 
at  between  100  Mbit/sec  and  perhaps  1  Gbit/sec,  and  the  peak  capacity  of  the  photonic 
backplane  is  between  0.1  Terabit/sec  and  perhaps  10  Terabit/sec.  Each  printed  circuit  board 
contains  one  or  more  smart  pixel  arrays  which  could  access  the  photonic  backplane  through 
an  Optical  Extender  Card  (OEC),  as  shown  in  Figure  1  [1].  This  paper  proposes  a  high 
performance photonic backplane architecture called the "HyperPlane".


Figure I: (a) PCBs with Optical Extender Card, (b) Photonic Backplane based on OECs.


2. The Photonic HyperPlane.

The  HyperPlane  architecture  can  be  modeled  as  a  collection  of  N  nodes  partitioned  among 
M  printed  circuit  boards  (PCB)  or  multi-chip  modules  (MCM)  that  are  interconnected 
through  a  large  number  of  optical  communication  channels  provided  by  the  photonic 
backplane,  as  shown  in  Figure  II.  Each  node  is  composed  of  one  or  more  processing 
elements (PE) and a message processor (MP). The processing elements within the node
could be either general purpose computing processors or specialized processors such as
ATM switching nodes.

The  message  processor  has  the  responsibility  of  controlling  the  communication  between 
the nodes through access to a subset of the Z optical communication channels {C1, C2, ...,
CZ}. The MP has access to the communication channels through both X injector and
Y extractor access channels labeled {I1, I2, ..., IX} and {E1, E2, ..., EY} respectively,
where X<Z and Y<Z. The injectors
provide  the  capability  of  injecting  signals 
into  a  selected  subset  I  of  communication 
channels  while  the  extractors  are  used  to 
extract  information  from  another  subset  E 
of  communication  channels.  The  node-to- 
node  connectivity  associated  with  the 
combination  of  both  the  communication  and 
access  channels  creates  a  multi-dimensional 
interconnection  design  space  which  will  be 
called the "HyperPlane". Using this model,
all  N  node  interconnection  networks  of 
degree-^  can  be  embedded  onto  the 
HyperPlane.  By  allowing  the  MPs  to 
dynamically  change  the  node  access  to  the 
backplane  communication  channels  the 
effect  of  a  dynamically  programmable 
interconnection  network  is  achieved.  Such  a 
programmable  interconnection  network 
could  implement  crossbars,  meshes, 
hypercubes  and  other  networks  as  required 
for  each  desired  software  application.  The  backplane  communication  channels  can  be 
implemented  through  space,  wavelength  or  time-division  multiplexing  or  through 
combinations  of  these  basic  approaches. 

3.  The  Smart  Pixel  Arrays. 

The  smart  pixel  arrays  for  a  HyperPlane  based  upon  space-division  multiplexing  have  been 
described  in  [4].  Pixels  can  be  programmed  to  be  in  one  of  four  states,  the  "transparent", 
"transmitting",  "receiving"  and  "transmitting-and-receiving"  states.  The  state  of  a  pixel  can 
be  programmed  by  down-loading  a  bit-stream  from  the  associated  message-processor.  The 
pixels  can  also  be  programmed  to  recognize  and  receive  messages  for  any  destination  by 
down-loading  the  appropriate  address  bits.  The  pixels  are  logically  arranged  into  an  array 
called  a  "communication  slice"  as  shown  in  Figure  III,  where  each  row  represents  a  logical 
communication  channel  which  is  many  bits  wide.  Each  slice  interconnects  a  electronic 
channels  from  the  MP  to  p  optical  channels  in  the  backplane  where  p>a.  The  ratio  of 
electrical  to  optical  bandwidth  of  a  slice  can  be  adjusted  by  varying  a  and  p.  The  data  for 
configuring  the  slice  is  loaded  in  bit-serially;  parallel  data  to  be  transmitted  enters  from  the 
top  and  parallel  data  being  received  exits  from  the  bottom.  A  smart  pixel  array  containing  a 
32-by-32  array  of  pixels  can  be  organized  in  various  formats,  i.e.,  with  one  32-by-32  slice, 
with  four  16-by-16  slices,  or  with  sixty-four  4-by-4  slices.  The  ratio  of  electrical  to  optical 
bandwidth  of  a  die  may  be  adjusted  by  varying  the  number  of  slices  on  the  die  and  by 
varying  a  and  p  within  the  slices.  These  organizations  allow  the  architect  to  vary  the 
architectural aspects of the HyperPlane.
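
The four pixel states and the slice organizations listed above can be captured in a short sketch. The enum and function names are hypothetical, and the code assumes square slices that tile a square pixel array exactly, which matches the three examples given but is not stated as a general rule in the text.

```python
from enum import Enum


class PixelState(Enum):
    """The four programmable smart pixel states named in the text."""
    TRANSPARENT = "transparent"
    TRANSMITTING = "transmitting"
    RECEIVING = "receiving"
    TRANSMITTING_AND_RECEIVING = "transmitting-and-receiving"


def slices_per_array(array_side: int, slice_side: int) -> int:
    """Number of square slices of side `slice_side` that tile a square pixel array."""
    if slice_side < 1 or array_side % slice_side != 0:
        raise ValueError("slice side must divide the array side")
    return (array_side // slice_side) ** 2


if __name__ == "__main__":
    for side in (32, 16, 4):
        print(f"{side}-by-{side} slices in a 32-by-32 array:", slices_per_array(32, side))
```

For a 32-by-32 array this gives one, four and sixty-four slices for slice sides of 32, 16 and 4, as quoted in the paragraph above.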
4. Graph Embeddings in the HyperPlane.

The HyperPlane can embed conventional interconnection networks by programming the
pixels accordingly. Optimal embeddings would minimize the length of the longest optical
channel in the backplane. Optimal embeddings for linear arrays, 2D and 3D meshes,
toroids, hypercubes, crossbars, dilated crossbars, hypermeshes and shuffle-networks have
been identified. A typical embedding is shown in Figure V. Each box represents a backplane
node with one or more smart pixel arrays and associated message processors. Each vertical
line represents an electrical channel between an MP and its smart pixel array. Each bold
horizontal line represents an optical channel between different nodes which is implemented
by programming the pixels at each end as transmitters and receivers while those in between
remain in the default transparent state. The dark squares represent transmitters and the clear
circles represent dynamic receivers. Multiple circles aligned along a vertical line represent a
concentration function. The embeddings can be changed in real time by down-loading
appropriate control bits to the smart pixel arrays.
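
How one optical channel of an embedding is set up can be sketched as follows: the pixel at the sending node is programmed as a transmitter, the pixel at the destination as a receiver, and every other pixel on that backplane row stays in the default transparent state. The function name, the string labels and the one-pixel-per-node row model are illustrative assumptions, not the control format of the actual smart pixel arrays.

```python
from typing import List


def program_channel(num_nodes: int, transmitter: int, receiver: int) -> List[str]:
    """Return the pixel state at each node along one backplane row.

    One pixel per node is assumed; untouched pixels remain transparent.
    """
    if not (0 <= transmitter < num_nodes and 0 <= receiver < num_nodes):
        raise ValueError("node index out of range")
    if transmitter == receiver:
        raise ValueError("a channel needs two different nodes")
    states = ["transparent"] * num_nodes
    states[transmitter] = "transmitting"
    states[receiver] = "receiving"
    return states


if __name__ == "__main__":
    # A channel from node 1 to node 3 in a five-node backplane row.
    print(program_channel(5, 1, 3))
```

Changing the embedding then amounts to recomputing such state lists for every row and down-loading them to the arrays.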

A Circular HyperPlane is a variation of the basic HyperPlane where the optical
communication channels are organized in rings. One physical implementation of a large
scale Circular HyperPlane is shown in Figure IV. The Circular HyperPlane may offer
improved performance over the conventional linear HyperPlane since
the lengths of the longest optical channels of embedded networks may be reduced. This
implementation is also scalable to very large capacities spanning large numbers of printed
circuit boards.

Figure III. (a) A single communication slice and (b) a large smart pixel array with multiple slices.


The  HyperPlane  architecture  can  also  be  designed  to  exploit  transparent  long  distance 
optical  transmissions.  By  appropriately  designing  the  optical  extender  card  it  is  possible  to 
allow  long-distance  optical  transmissions  to  by-pass  intermediate  smart  pixel  arrays  as  they 
travel  down  the  backplane  [1].  This  "Transparent  HyperPlane"  may  improve  performance 
by  eliminating  the  delays  associated  with  passing  through  intermediate  smart  pixel  arrays. 
Since  the  intermediate  smart  pixels  can  no  longer  transmit  or  receive  over  the  long-distance 
channels  which  bypass  them,  there  may  be  a  restriction  on  the  types  of  embeddings  possible. 
Figure IV: A physical implementation of a Circular HyperPlane.


Figure  V.  Embedding  of  a  crossbar  network. 
Abstract

We have demonstrated a representative portion of an optical backplane using free-space
optical  channels  to  interconnect  printed  circuit  boards  which  employ  FET-SEED  based  smart 
pixel  arrays.  Results  of  system  demonstrator  performance  are  discussed. 

Introduction 


Future  digital  systems  such  as  ATM  switching  systems  and  massively  parallel 
processing  computer  systems  will  have  large  board-to-board  connectivity  requirements  that 
will  not  be  met  by  conventional  electronic  backplanes.  Free-space  optical  interconnects 
represent  a  solution  to  the  needs  of  these  future  connection-intensive  digital  systems.  Such  an 
interconnect  is  capable  of  providing  greater  connectivity  via  an  optical  backplane.  This  optical 
backplane can be established using two-dimensional arrays of passive, free-space Parallel
Optical  Channels  (POCs)  which  optically  interconnect  electronic  Printed  Circuit  Boards 
(PCBs).  Such  a  backplane  could  be  capable  of  supporting  terabit/second  aggregate  capacities 
with  connectivity  levels  on  the  order  of  10,000  input/output  channels  per  PCB. 

Based  on  these  needs,  we  are  currently  developing  the  optics,  optomechanics,  and 
PCB  optoelectronic  packaging  to  demonstrate  these  high  bit-rate  optical  backplanes.  As  part  of 
this  program,  we  have  constructed  a  representative  portion  of  an  optical  backplane  capable  of 
interconnecting  two  printed  circuit  boards  using  diffractive  optics  to  establish  the  POCs  and 
FET-SEED  smart  pixel  transceiver  arrays  for  processing.  Figure  1  schematically  represents 
the  two  sided  demonstrator  system. 


Figure 1: Schematic of the Optical Backplane




FET-SEED  Smart  Pixel  Arrays 


The  FET-SEED  transmitter  and  receiver  smart  pixel  circuits  were  designed  and 
fabricated  using  a  batch  fabrication  process/^)  Figure  2a  shows  a  schematic  of  the  transmitter 
circuit.  At  the  input,  the  drive  FET  modulates  the  voltage  across  a  Multiple  Quantum  Well 
(MQW)  modulator  pair  resulting  in  differentially  modulated  output  light.  The  electrical  input 
impedance  was  designed  for  50  ohms  to  ensure  efficient  coupling  of  high  frequency  signals, 
which resulted in high speed operation of the optical modulators. Figure 2b shows a schematic
of the receiver circuit. Here, the high speed optical modulation was detected using the MQW
diode pair, fed to an inverting amplifier section, and then amplified using power FETs (375
µm gate width) designed to drive 100 ohm transmission lines on a PCB. Both the 4 x 4
transmitter and receiver array optical windows were 25 x 25 µm, separated by 50 µm, with
the pixel to pixel pitch being set at 200 µm.


Figure  2a:  Transmitter  Array 


Figure  2:  FET-SEED  Smart  Pixel  Circuits 

Each  smart  pixel  array  was  bonded  into  a  high  speed  quad  flat  pack  that  was  then 
installed  onto  a  printed  circuit  board  via  solderless  connectors.  These  connectors  permitted 
impedance  matching  of  the  smart  pixel  array  input/output  impedances  to  the  50  ohm  printed 
circuit  board  transmission  lines.  Measurements  of  the  rising  edges  of  the  packaged  transmitter 
and receiver smart pixel array circuits yielded 0.811 ns and 2.57 ns respectively, in good
agreement with device and circuit models.(2) These circuits were designed to run at 155
MBits/sec  in  parallel. 


Optics  and  Optomechanics 

The  optics  and  optomechanics  implemented  in  the  demonstrator  were  designed  to 
create  optical  microchannels  using  diffractive  lenslet  arrays  and  binary  phase  gratings.  The 
board-to-board  interconnection  was  achieved  using  a  two-sided  optical  backplane  approach(^), 
and  the  optomechanics  were  based  on  a  modified  baseplate  approach  in  which  magnesium  was 
used  as  the  baseplate  material. 

Figure  3  shows  a  schematic  of  the  optical  system.  The  optical  power  supply  was 
achieved  using  light  from  the  output  of  an  argon  laser  pumped  Ti:Sapphire  laser,  which  was 
delivered  to  the  baseplate  and  collimated  using  collimating  optics.  A  binary  phase  grating  and  a 
40.34 mm Fourier lens were used to provide spot array generation. Risley beam steerers were
used for fine positioning of the beam. The spot array was relayed using 2 lenslets to form a 4F
relay at the focus of the Fourier lens to the modulator smart pixel array. The modulated light
was then relayed from the transmitter board to the receiver board using a second 4F relay
created by two lenslet arrays. Polarizing optics were used to direct the beams, with vertically
polarized light being reflected off a polarization beam splitter, directed through a quarter
waveplate, and focused onto the transmitter array. Modulated light was then reflected back
towards the adjacent printed circuit board through the quarter waveplate, which horizontally
polarized the beams, causing them to propagate over to the receiver board. The modulated
light was detected, and the signals were directed off the PCB for appropriate processing.



System  Performance 

The system was operated in two configurations. Based on the 600 µm center-to-center
spacing of the lenslet arrays, in the first configuration the 4 corners of the modulator/receiver
smart pixel arrays were interconnected optically. Figure 4 shows the results of transmitting
data from board to board over one of the optical channels. In this configuration, each lenslet
supports one dual-rail optical channel.


Figure 4: 16 Bit patterns and a PRBS at 50 MBit/sec - individual channel performance

In the second configuration, a single lenslet was aligned to support four dual-rail
differential optical channels in a Super Pixel configuration. Based on the efficiency of the
optics, we could interconnect a 2 x 2 modulator with a 2 x 2 receiver pitched at 200 µm,
yielding an effective spatial channel density of 1111 channels/cm². Figure 5 shows a typical
recording of the output of the system with all four channels being driven simultaneously. To
our knowledge, this is the first time a single lenslet has been used to support the
interconnection of not only one dual-rail optical channel but also four dual-rail optical
channels.
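
The quoted channel density is consistent with counting four channels per lenslet cell on the 600 µm lenslet pitch mentioned earlier. The small sketch below makes that arithmetic explicit; treating the density as channels per lenslet divided by the square of the lenslet pitch is an assumption about how the figure was obtained, and the function name is hypothetical.

```python
def channel_density_per_cm2(channels_per_lenslet: int, lenslet_pitch_um: float) -> float:
    """Channels per square centimetre for one lenslet cell of the given pitch."""
    if lenslet_pitch_um <= 0:
        raise ValueError("pitch must be positive")
    pitch_cm = lenslet_pitch_um * 1e-4  # 1 µm = 1e-4 cm
    return channels_per_lenslet / pitch_cm ** 2


if __name__ == "__main__":
    print(round(channel_density_per_cm2(4, 600.0)))  # Super Pixel configuration
    print(round(channel_density_per_cm2(1, 600.0)))  # one channel per lenslet
```

On the same assumption, the first configuration with one dual-rail channel per lenslet corresponds to about a quarter of that density.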


Figure 5: Four dual rail optical channels operating at 25 MBit/sec through a single lenslet
relay.


In  conclusion,  we  have  constructed  an  optical  backplane  demonstrator  capable  of 
PCB-to-PCB  optical  interconnection.  The  free-space  optical  channels  were  defined  using 
multilayer  diffractive  lenslet  arrays  and  polarizing  optics  which  interconnected  FET-SEED 
smart  pixel  arrays.  Future  demonstrators  will  expand  on  these  results  by  implementing 
sophisticated architectures such as the HyperPlane which utilize the increased connectivity
provided by an optical backplane.(^)
Abstract. We report on an experimental 64 channel, high data rate, free-space
interconnection network for massively parallel processing. It uses VCSEL
emitter arrays, fiber arrays for detection and a passive optical routing
network. It forms the global network between clusters in the Interconnection
Cache architecture [1].


1.  Introduction 

Our  long  term  objective  is  to  develop  a  system  capable  of  achieving: 

• Interconnection of 1000s of processors in an interconnection cache architecture [1].

• Processor-to-processor data rates of 1-10 Gbit/sec.

Many free-space optical interconnection schemes have been suggested, but they often
suffer from very demanding optics (SEED systems [2]), or require diffractive elements
(and hence wavelength control), or use large central switches (such as SLMs [3]
and photorefractive crystals [4]) which do not scale well. We have attempted to make
the best use of the optical and electronic components currently available. The optics is
used only as an efficient method for communication, while signal generation, routing and
detection are all done using electronics. Beam combination and splitting are done by simple
mirror surfaces (no space-variant redirection), and relaying is done by simple optics,
on-axis wherever possible, to reduce the build-up of aberrations.
Figure 1. Schematic of the optical system.


2.  System  Overview 

Our  experimental  system  (figure  1)  presents  the  one-way  paths  of  the  optical  channels. 
All  possible  routings  to  2  PEs,  A  and  B,  are  shown.  The  experiment  has  4  ‘columns’ 
each  with  one  simulated  ‘board’  of  16  processors  driving  an  8x8  element  VCSEL  array. 
Data  is  received  at  the  board  by  a  4x4  fiber  array  and  led  away  to  a  separate  receiver 
for  each  PE. 

Data is carried on arrays of parallel, slowly diffracting Gaussian beams from the
VCSELs and relayed by microlens and micromirror arrays and simple bulk lenses. The
experimental system is constructed on a slot-rail system which allows modular assembly,
minimal alignment and good stability, similar to slot-plate systems already demonstrated
[2].

Each optical channel is fixed and terminates at the receiving photodiode of one
processor, so that there are N channels for N processors. Each processor selects, via
electronics, the appropriate laser that transmits onto the channel of the target receiver
fiber and PD.

Since a large number of processors must have access to the same channel, an optical
fan-in problem exists. Fan-in occurs at each beamsplitter cube. To optimize the fan-in
power efficiency, the reflectivity at each beamsplitter must be controlled, so that,
beginning at the most distant processor from the target, the coupling efficiencies onto
the channel are 1, 1/2, 1/3, ..., 1/M for a fan-in of M. This gives a 1/M efficiency for
every connection. However, because all the channels within a combining cube are generally
at different distances from their targets, the coupling efficiencies for each channel
within a single cube are also different. Hence, micromirror arrays are used at the center
of each cube, and the channels are kept spatially separate. This is the main reason for
the slow-Gaussian microbeam approach. The approach also enables easy routing of individual
beams onto and off columns.
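
The fan-in argument can be checked numerically. The sketch below is only an illustration under an assumption that is not stated in the text: each splitter is lossless, so a splitter that couples a fraction c of a new beam onto the channel passes a fraction 1 - c of the light already travelling on it. The function names are hypothetical.

```python
from fractions import Fraction


def coupling_efficiencies(fan_in):
    """Coupling efficiency onto the channel at each beamsplitter,
    ordered from the most distant processor to the nearest: 1, 1/2, ..., 1/M."""
    return [Fraction(1, k) for k in range(1, fan_in + 1)]


def end_to_end_efficiencies(fan_in):
    """Fraction of each processor's light that finally reaches the target.

    Assumption (not from the source): lossless splitters, so light already on
    the channel is attenuated by (1 - c) at every later splitter of coupling c.
    """
    coupling = coupling_efficiencies(fan_in)
    result = []
    for i, c in enumerate(coupling):
        efficiency = c
        for later in coupling[i + 1:]:
            efficiency *= 1 - later
        result.append(efficiency)
    return result


if __name__ == "__main__":
    for m in (2, 4, 8):
        print(m, [str(e) for e in end_to_end_efficiencies(m)])
```

With these coupling values every processor ends up with the same 1/M share of its power at the receiver, which is the equal-efficiency property the paragraph describes.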

The  key  advantages  of  this  system  are: 

1)  Distributed  nature  -  no  central  switch  or  control  needed. 

2)  Very  few  high  speed  i/o’s  to  each  VCSEL  array,  independent  of  network  size. 

3) The  microbeam  approach  is  scalable. 

4) Only  one  laser  per  LDA  is  on,  so  low  LDA  power  dissipation,  reliable. 

5) Speed  limited  by  LD/PD  technology,  not  by  optics  technology. 
3. Optics


Diffraction limits the dimensions and hence the total number of distinct microbeams
that can be supported within a faceted splitter/combiner cube. The relation giving this
maximum number, $N_{max}$, is

$$N_{max} = \frac{\pi n D}{4 \lambda m^{2}} \qquad (1)$$


where n is the cube refractive index, D the cube dimension and $2 m r_0$ the beam pitch
(where $r_0$ is the $1/e^2$ radius) in the cube. Figure 2 shows the relation between
$N_{max}$ and λ for m = 1.5 at various cube sizes. It illustrates the high numbers of
channels that can be used with compact (<30 mm) optics.
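
As a rough numerical illustration of relation (1), the sketch below evaluates the maximum channel count for one set of values. The 850 nm wavelength and m = 1.5 come from the text; the refractive index of 1.5 and the 20 mm cube size used in the example call are assumptions chosen only for illustration, and the function name is hypothetical.

```python
import math


def max_channels(refractive_index, cube_size_m, wavelength_m, pitch_factor):
    """Evaluate N_max = pi * n * D / (4 * lambda * m^2) from relation (1).

    All lengths are in metres; pitch_factor is m, the beam pitch expressed
    in units of twice the beam radius.
    """
    return (math.pi * refractive_index * cube_size_m
            / (4.0 * wavelength_m * pitch_factor ** 2))


if __name__ == "__main__":
    # n = 1.5 and D = 20 mm are illustrative assumptions, not measured values.
    n_max = max_channels(1.5, 20e-3, 850e-9, 1.5)
    print(f"N_max is roughly {n_max:.0f} channels")
```

Because the relation is linear in D, doubling the cube dimension doubles the number of supportable microbeams under these assumptions.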


Figure  2.  Max.  number  of  channels  for  m  =  1.5. 

In our experiment, the 850 nm beams emitted from VCSEL arrays have a 250 µm
pitch, and are collimated by ion-exchange planar microlenses. Some micromirror arrays
are conformal HOEs made in dichromated gelatin, with controlled efficiencies and
optimized for 850 nm, and some are evaporated patterned metal.

Board-to-board relaying (20-50 mm) within a column is done by f = 14 mm, 16-level
diffractive microlens arrays with a typical efficiency of 90%. The microlens arrays are
bonded to the surface of each cube. The distance from column to column is larger
(~300 mm for a board width), and conventional bulk doublets can be used in a 4-f imaging
arrangement to relay. Individual beams have extremely small numerical apertures (NAs),
easing imaging requirements.

The theoretical aberrations of this system are very low, because NAs are very low
(< 0.01), and much of the imaging is on-axis. Exact ray-tracing simulations on the
worst-case path of the experimental system show negligible aberrations (Strehl ratio 1).
The only significant aberration added by the telecentric 4-f bulk lens imaging system is
spherical aberration, which, in our case, results in a beam pointing and position error.
Simulations show that for our standard 40 mm lenses at full 7 mm field, the pointing
error is only 0.025° and the position error 0.69 µm. For a full-size system using 80 mm
lenses and 20 mm cubes, the full 15 mm field has errors of only 0.028° and 2.1 µm. After
8-f, the errors are 0.057° and 8.6 µm. Optimized lenses would give improved performance,
potentially allowing greater separation.
4. Experimental

Our  system  could  operate  a  maximum  of  48  ‘slow’  channels  (50Mb/s  max.)  driven  from  a 
logic  analyzer,  and  1  fast  channel  (3Gb/s,  driven  from  a  bit-error  tester)  simultaneously, 
with  the  particular  channels  used  chosen  by  hardware  connections. 

Figure  3(a)  shows  a  state  analysis  diagram  of  24  channels  operating  without  error 
at  20Mb/s.  50Mb/s  operation  was  limited  by  the  performance  of  the  fiber  receivers  used 
and  all  but  4  possible  channels  (not  lasing,  or  high  threshold)  operated  correctly.  Figure 
3(b) shows a 1 Gb/s eye diagram from the error tester. All tested fast channels operated
with  an  error  rate  of  <  10“^^,  and  most  operated  up  to  1.6  Gb/s,  limited  by  driver 
circuitry  design  and  impedance  mismatching. 

Limitations were due to polarization instability, heating and variation in the
VCSELs' properties. Also, lossy metal micromirrors and the use of diffractive microlens
arrays limited signal powers. Future larger systems will require better VCSELs and more
efficient components. The experimental system has nevertheless indicated the feasibility
of the approach for large networks, particularly those based on the interconnection cache
architecture.


Figure 3. (a) 20 Mb/s state diagram (24 channels); (b) 1.0 Gb/s eye diagram, typical
channel.
A mesh-connected bus networking topology is proposed for implementing
the three-stage Clos network and is experimentally demonstrated using a
WDMA technology.


We  consider  networks  for  interconnecting  parallel  computers  that  are  parameterized  by 
their  diameters  (number  of  switching  stages)  and  degree  (fan-out)  of  switches  which  are  used 
to  build  the  network.  Clos  used  switch  fan-out  to  describe  a  family  of  networks  that  range 
from  the  complete  network  that  has  maximum  fan-out  and  minimum  switching  diameter  to  the 
"Benes"  network  with  nearly  minimum  fan-out  and  larger  networking  diameter.  Electronic 
networks  tend  towards  the  small  fan-out/large  diameter  end  of  the  spectrum  due  to  limitations 
on  the  fan-out  of  electronic  switches.  The  fan-out  of  systems  of  optical  switches  is  anticipated 
to  be  significantly  higher  than  electronic  switches.  Thus,  it  has  been  suggested  that  optics  can 
be  used  to  build  the  complete  network  which  is  the  most  desirable  of  Clos's  family  of 
networks.  Even  optics,  however,  has  practical  limits  on  the  fan-out  of  switches  in  any  given 
configuration. In this paper, we describe an implementation of a network with significantly
smaller degree and slightly larger network diameter than the complete network. The network
that we implement, called the mesh-connected bus network, is the second most desirable
network defined by Clos and is the one widely referred to as the Clos network.


Fig.1. (a) A 16-node Clos network and (b) its MCB topology.
The Clos network is often depicted in its linear node distribution format, such as the
one shown in Fig.1(a). We remap this 1D format to the 2D linking topology of Fig.1(b), which
can be referred to as the mesh-connected bus (MCB) topology. The three switching stages are
embedded into a permutation along a row, followed by one along a column, and finally one
along a row in a grid-structured 2D node array. It can be shown that, depending on the number
of switches used in each row and column, the MCB offers at least a rearrangeable non-blocking
switching environment. For some interconnect topologies, efficient embeddings leading to a
two-stage or even a single-stage routing are possible. It has also been shown that the MCB
can deliver comparable performance under randomized routing environments. Details of other
important embedding features, as well as off-line and on-line communication features of the
MCB network, can be found in Ref. [1].


To optically implement the MCB interconnect, we propose to use the wavelength-division
multiple access (WDMA) concept, which allows N users to communicate with only √N wavelength
channels. To support the WDMA communications, a critical passive component is a star-coupler,
which serves to broadcast the wavelength-coded signals. A star-coupler is typically fabricated
from a butterfly or a back-to-back tree connection of individual 2x2 fiber-couplers. It can
be seen that as many as 3√N such composite star-couplers are needed in order to establish the
MCB networking using a WDMA scheme. One aspect of this research is to try to conceptualize
optical interconnect schemes which can greatly reduce the overall hardware complexity of a
network. To accomplish this goal for implementing a WDMA MCB interconnect, simple free-space
optical components such as cylindrical optical lenses, mirrors, or 1D gratings, etc. were
proposed for handling the star-coupling functions of the WDMA signals (see Fig.2(a) - 2(d)).
Each approach can supply the parallel array of √N star-couplers necessary for one stage of
the WDMA-based MCB routing. While the first two approaches allow only the so-called continuous
broadcasting, the last two approaches can perform discrete broadcasting, thereby reducing the
optical power lost.

Fig.2. (a) - (d), four possible ways of making free-space star-coupler arrays.
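
The resource counts quoted above can be tabulated with a few lines of code. This is a minimal sketch that assumes N is a perfect square so that the nodes fill a √N x √N grid; the function name and the dictionary keys are hypothetical, and only the √N and 3√N relations are taken from the text.

```python
import math


def mcb_wdma_requirements(num_nodes):
    """Return the per-bus fan-out, wavelength count and star-coupler count
    for a WDMA mesh-connected bus network of num_nodes nodes.

    Assumes a square grid, i.e. num_nodes must be a perfect square.
    """
    side = math.isqrt(num_nodes)
    if side * side != num_nodes:
        raise ValueError("num_nodes must be a perfect square")
    return {
        "nodes_per_bus": side,      # fan-out needed along each row or column bus
        "wavelengths": side,        # sqrt(N) wavelength channels
        "star_couplers": 3 * side,  # sqrt(N) couplers for each of the 3 stages
    }


if __name__ == "__main__":
    for n in (16, 256, 1296):
        print(n, mcb_wdma_requirements(n))
```

For example, a 256-node network corresponds to 16 channels per bus and a 1,296-node network to 36 channels per bus.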

The proposed system concepts are experimentally verified through routing both low
(10 MHz) and high (1.25 GHz) bandwidth optical signals (see Fig.3 for our system
connection diagram) [2]. In the low bandwidth situation, some base-band video, audio and
RS-232 data are multiplexed in the square-wave frequency modulation format and occupy a
20 MHz bandwidth. Conventional Fabry-Perot lasers operating in the range of 1290 - 1340 nm
were used. Our power measurement confirmed that a fan-out to as many as 36 channels could be
established along each bus, making a network linking 1,296 nodes possible power-wise.
Through 3 stages of free-space routing and the associated electric-to-optical and
optical-to-electric conversions, an overall 49 dB signal-to-noise ratio for video signal
reception was still maintained. The images shown in the four quadrants of Fig.4 are the three
wavelength-coded video input images and one selected output image obtained through WDMA
filtering. The sound channels can also be added or dropped at each stage electronically. Our
high bandwidth data transmission experiment involved using high-quality DFB lasers operating
around 1310 nm.

Fig.3. Flow chart of our demonstration setup.

Fig.4. A typical WDMA video transmission and selection quality.

The corresponding optical system power measurement illustrates that at the transmission
rate of 1.25 Gb/s, our WDM bus array can accommodate 16 channels per bus, or equivalently
256 nodes in the network, with a guaranteed receiving bit-error rate lower than 2 x 10^-13
(see Fig.5 for the bit-error-rate measurement results). The photograph of the overall system
is shown in Fig.6, where the TV monitor shown at the right-hand side displays four identical
images in its four quadrants. They are the input image and the received images after being
transmitted through the first, the first and second, and all three stages of our MCB network.
The majority of space on the breadboard was occupied by electronics for handling signal
multiplexing/demultiplexing, modulation/demodulation, electric-to-optical conversion, and
optical-to-electric conversion.

Fig.5. Bit-error-rate test result of 1.25 Gb/s data transmission in MCB. (Horizontal axis:
input power, dBm.)

Fig.6. A picture of our demonstration system.
Free-space optical schemes to minimize the number of switches for non-blocking
multicast and broadcast interconnect applications are introduced and experimentally
demonstrated based on the multidimensional multiplexing concept.

A non-blocking generalized switching network of N nodes is one where each input
can broadcast or multicast its message to any combination of output nodes without
experiencing any internal blocking. If one input is multicast to j < N outputs,
the remaining N-j output nodes can still be accessed by any of the remaining input nodes
without experiencing internal blocking. By this criterion, most popular point-to-point
switching networks do not qualify for non-blocking communication in generalized
switching applications. Using the space switching concept, a cross-bar network is
about the only one which handles strictly non-blocking general-purpose networking. A
cross-bar has to use on the order of $N^2$ basic, i.e. 2x2, switches, making the hardware
implementation of a large network a very difficult task. The reduction of switching
complexity for a general-purpose strictly non-blocking network is not possible using a
purely 1D switching technique, since such a network has to make available, at each of its
N output nodes, all its N input signals so that any output node can independently select
any input signal.

In our approach, multiple multiplexing techniques are used instead of purely 1D switching
[1]. Here, multiple multiplexing means that signals explicitly multiplexed in the time,
wavelength, or space domain are multiplexed again, one on top of the other. Let us assume a
general k-tuple multiplexed system of N nodes with $N = \mu_1 \mu_2 \cdots \mu_k$; each node
has a unique identity composed of k indices. Since each of the N receiving nodes receives the
entire information from all the input nodes, to select an input at an output node the
receiver has to access each of the k dimensions and select (by demultiplexing) the
corresponding channels. Selecting a specific sub-channel in dimension d is equivalent to a
complexity of $O(\mu_d - 1)$, i.e. the number of basic 2x2 switches is $O(\mu_d - 1)$, and
for N users the total switch count is $O((\mu_d - 1)N)$. Thus, the total complexity of the
network is $O[(\mu_1 + \mu_2 + \dots + \mu_k - k)N]$. Since all $\mu_i$ (i = 1, 2 ... k) are
symmetrical, a minimum complexity will occur when all $\mu_i$ are identical. Thus, a minimum
complexity of $kN(N^{1/k} - 1)$ is obtained for $\mu_1 = \mu_2 = \dots = \mu_k = N^{1/k}$.
The overall complexity can be reduced by increasing k; however, since the minimum number of
sub-channels in any dimension is two, no more than $\log_2 N$ dimensions can be used for a
total of N signals. Thus, for $k = \log_2 N$, the complexity approaches the absolute limit of
$N \log_2 N$, which ties the information lower bound pointed out by Shannon [2].

Using electronics, a k=2 system (time and space) can be implemented. The use of optical
fiber components allows k=3 with an additional wavelength dimension. When free-space optics
is used, angular dimensions in two perpendicular planes can be included as extra degrees of
freedom, making a k=5 system feasible. One additional advantage of multiple multiplexing is
that, instead of squeezing all complexity into a single dimension and making the system work
at its extreme limit, each dimension handles its fair share of switching complexity within
its comfortable range. In Fig.1, switching complexity comparisons are shown for various
multicast non-blocking circuit switches. The standard cross-bar and the tree-based cross-bar
switches, each of which consumes about $N^2$ basic switches, are represented by the top
curves, while the theoretical minimum complexity is denoted by the bottom curve, labelled
$N \log_2 N$. Between them are the two curves labelled k=2 and k=5, which correspond to the
two multidimensional multiplexing cases mentioned. It can be seen that for large-scale
networks, the proposed concept offers a significant hardware reduction.

Fig.1. Plots of network switching complexity.
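
The complexity expressions above are easy to compare numerically. The sketch below evaluates the cross-bar count and the multidimensional count $(\mu_1 + \dots + \mu_k - k)N$ for a 1,024-node example; the network size and the function names are illustrative assumptions, and constant factors hidden by the O-notation are ignored.

```python
from math import prod


def crossbar_switches(num_nodes):
    """Order-of-magnitude switch count of a cross-bar: N^2."""
    return num_nodes ** 2


def multiplexed_switches(channels_per_dim):
    """Switch count (mu_1 + ... + mu_k - k) * N for a k-dimensional
    multiplexed network, where N is the product of the mu_i."""
    k = len(channels_per_dim)
    num_nodes = prod(channels_per_dim)
    return (sum(channels_per_dim) - k) * num_nodes


if __name__ == "__main__":
    n = 1024
    print("cross-bar      :", crossbar_switches(n))
    print("k = 2 (32 x 32):", multiplexed_switches([32, 32]))
    print("k = 5 (4^5)    :", multiplexed_switches([4] * 5))
    print("k = 10 (2^10)  :", multiplexed_switches([2] * 10))  # equals N log2 N
```

With two sub-channels per dimension the count reduces to $N \log_2 N$, the limiting case discussed in the text.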


To confirm the proposed multi-dimensional switching concept, an N = 9, k = 2 system
utilizing two mutually orthogonal angular dimensions is sketched in Fig.2 in top and
side views. For the broadcast and select operations, respectively, a 2D Dammann grating and
two 1D optical deflector arrays are employed. The Dammann grating splits an input beam into
9 angularly encoded beams and multiplexes them with the help of a spherical lens into 9
spatial spots to be selectively switched by the two beam deflector arrays. The first and
second deflector arrays select one out of three angles along the XZ and YZ planes,
respectively. In this way, instead of using 81 switching states, only 2x9x3 = 54 switching
states are used. The larger the N, the greater the saving in switching states.
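
The state count in this example can be generalized in a small sketch. The generalization k x N x μ (k dimensions, N nodes, μ angular channels per dimension) is an assumption extrapolated from the single 2x9x3 example in the text, and the function names are hypothetical.

```python
def crossbar_states(num_nodes):
    """Switching states of a full N x N selection: N^2."""
    return num_nodes ** 2


def multidim_states(channels_per_dim, num_dims):
    """Switching states when each of the N = mu^k nodes selects one of mu
    channels in each of the k dimensions: k * N * mu (assumed generalization)."""
    num_nodes = channels_per_dim ** num_dims
    return num_dims * num_nodes * channels_per_dim


if __name__ == "__main__":
    for mu in (3, 8, 32):
        n = mu ** 2
        print(n, crossbar_states(n), multidim_states(mu, 2))
```

For the N = 9 case this reproduces the 81 versus 54 comparison, and the relative saving grows with N.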
The angular selective switching can be implemented using a multichannel A-O
deflector array. Since the number of angular channels is fundamentally related to the
spatial and temporal frequency resolutions, various technological and fundamental
limitations restrict the network interconnect capacity. We have performed some first-order
calculations of the limitation on such a capacity based on parameters of the multichannel
A-O deflector array. Fig.3 plots network capacity limits due to the A-O deflector's
time-bandwidth product, the grating's space-bandwidth product, and the grating to A-O cell
interfacing numerical aperture, where the parameters are $f_1$, the focal length of the lens;
the central frequency of the A-O deflector; λ, the optical wavelength; z and h, the
thickness and width of the transducer of each deflector; and s, the spacing between two
consecutive deflectors. More than 32 angular channels can be used in each dimension, making
a system interconnecting more than 1,024 nodes possible.

Fig.2. Top and side views of the proposed switching setup. (Labels: from transmitters,
aperture, x deflectors, y deflectors, lenslet array, to receivers.)

Fig.3. Limit on the number of multiplexible angular channels. (Horizontal axis: channel
spacing, mm.)


Experimental verification of the proposed k=2 system was performed using two 8-channel
TeO2 A-O deflector arrays fabricated by Brimrose [3]. Because an 8x1 and a 1x8 deflector
array were used in cascade instead of two ideal 8x8 deflector arrays, our proof-of-principle
experiment was designed to demonstrate one row of the non-blocking selection of an
N = 8x8 = 64 node switching system. The experimental setup is photographed in Fig.4, where a
folded version (using two mirrors to reduce the longitudinal dimension) of the geometry of
Fig.2 is used. A 2D mask containing 8x8 round holes (see Fig.5(a)) was used. To better track
this input pattern in the system, eight holes in the diagonal direction were blocked. The
duplication of the single focused pattern into the 8 patterns needed for the deflector array
was performed by the Dammann grating. In Fig.5(b), a photograph containing three
simultaneously deflected patterns from three of the Bragg deflectors (including the 2nd and
the 8th) is shown. Each of the 3 deflected 8x8 patterns was then individually tuned to allow
only one row of the 8 Bragg degenerated spots to pass a narrow slit attached to the second
lens (see Fig.5(c)). This slit and lens combination was used to perform a unit-magnification
imaging of the light distribution at the first Bragg deflector array onto the second Bragg
deflector array. Fig.5(d) shows the photograph taken at the second Bragg deflector array
plane, where only 3 spots appear due to our slit selection. Since each of the 3 spots still
contains 8 angularly multiplexed signals, the tuning of the second-stage Bragg deflector
could further select a single spot from each of the three 8-spot arrays using still another
slit-lens combination with the slit perpendicular to the first one. In our case, since all
the 3 spots were focused onto a single Bragg deflector, they were deflected by the same
angular quantity. By arranging the deflecting angle so that one out of 8 spots in each of the
three first-stage selected beam arrays passed through the slit-lens combination, we obtain
the final selection result at the output plane (see Fig.5(c)). Thus, using a cascade of two
Bragg deflector stages, the operational principle of the proposed novel complexity reduction
scheme useful for the non-blocking multicast photonic network was confirmed.

Fig.4. A photo of the demonstration system.

Fig.5. Experimental results of switched signals.
Optical network for a general purpose massively parallel optoelec

An interconnection network applied to a general purpose massively parallel computing
architecture is discussed. Several criteria enable us to choose a specific network. An
optoelectronic system is proposed. Experimental results and simulations are presented. The
network is based on a 2D perfect-shuffle optical interconnection pattern coupled to an
electric mesh network. Light is propagated in free space and deflected by a microprism array.


1.  Presentation 

We have studied in our laboratory a processing architecture called OEDIPE (OptoElectronic
Digital PEs). It is a general purpose massively parallel optoelectronic SIMD/SPMD (Single
Instruction/Process Multiple Data) architecture. OEDIPE is composed of an electronic
processing element (PE) array, an electronic planar memory, a global control unit and optical
interconnections. Data are organized in a matrix. Different kinds of applications can run on
this architecture, such as databases, image processing and matrix operations.


2.  Network  structure 

Because of the difficulty of implementing a crossbar network [1] with as high a level of
parallelism as we want, we prefer here to select a communication kernel extracted from the
data motions required by the target applications. This kernel is composed of neighbourhood
data motions, row and column shifts, row and column broadcasts and matrix transposition.

Bidimensional optical multistage networks are attractive. Owing to the optical SBWP (Space
Bandwidth Product), these networks can be implemented with a high level of parallelism. A
multistage network is composed of several stages of fixed long link patterns combined with
dynamic short link patterns (the switch arrays). In our approach, the fixed long link patterns
are optical while the switch arrays are implemented with electronic technology, which is
efficient for neighbouring connections. Some multistage networks cannot produce a given data
motion. For the others, the way of running depends on the kind of network considered. We have
chosen a multistage network which meets the following criteria :

- minimizing the number of feed-back loops through the network for the desired
communications ;

- achieving the whole communication kernel with a single configuration order in each stage.
This order is broadcast to all the switches of a stage, so all the switches of a given stage
are set in an identical state. This property is very important for reconfiguring the
connection pattern as quickly as possible.
Classical multistage networks such as crossover [2] or shuffle-exchange [3, 4] are not well
suited to the previous constraints. So, we have developed a specific one, the 2D Enhanced
Shuffle-Exchange (ES-E) multistage network (figure 1). It is a bidimensional shuffle-exchange
network with additional neighbouring connections between nodes in a same stage. These new
connections are represented by dotted lines in figure 1. For N² inputs, the ES-E network
contains log2(N) stages. The fixed long link patterns between two consecutive stages are
identical and correspond to 2D perfect-shuffle schemes. This network can be reduced to the
last stage, over which log2(N) loops are performed.
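
The fixed link pattern can be illustrated with a short sketch. It assumes the usual separable definition of a 2D perfect shuffle, in which the row index and the column index are each permuted by a 1D perfect shuffle (a one-bit left rotation of the binary index); the text does not spell out this definition, and the function names are hypothetical.

```python
def shuffle_1d(index, bits):
    """1D perfect shuffle: rotate the `bits`-bit binary index left by one."""
    msb = (index >> (bits - 1)) & 1
    return ((index << 1) & ((1 << bits) - 1)) | msb


def shuffle_2d(row, col, bits):
    """2D perfect shuffle, assumed separable in row and column."""
    return shuffle_1d(row, bits), shuffle_1d(col, bits)


def apply_loops(row, col, bits, loops):
    """Apply the fixed 2D perfect-shuffle link pattern `loops` times."""
    for _ in range(loops):
        row, col = shuffle_2d(row, col, bits)
    return row, col


if __name__ == "__main__":
    bits = 3  # N = 8, i.e. an 8 x 8 array of inputs
    print(shuffle_2d(1, 5, bits))
    print(apply_loops(1, 5, bits, bits))
```

For an 8 x 8 array, log2(8) = 3 passes through the same fixed pattern bring every index back to its starting position, which is why a single physical stage traversed log2(N) times can stand in for the multistage network.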

Figure 1 : a 2D Enhanced Shuffle-Exchange multistage network with N² = 8 x 8 inputs.
(Labels: 2D perfect-shuffle; 4 I/O switch elements; neighbourhood networks (dynamic
connections).)


(Figure 2 labels: a single stage implementation of the ES-E network; a duplex bus; a
duplex broadcasting network.)

An implementation of the OEDIPE architecture is proposed in figure 2. This architecture is
composed of three optical fixed networks :

- a 2D perfect-shuffle network connected to the PE array ;

- a duplex bus between the PE array and the planar memory ;

- a duplex broadcasting network between the global control unit and the PE array through
the planar memory.

The ES-E network implementation is a single-stage one. PEs are interconnected with an
electrical mesh network so that they simulate switch arrays.


Figure  2  :  OEDIPE  architecture  design. 
3. Experimental approach

An optical solution to implement the ES-E network is now presented.

The I/O stage of the PE array is composed of a 2D array of pairs of emitters and receivers.
All beams are collimated by a microlens array. After crossing optical components which
perform the desired deflection pattern, all the beams are focused by the same microlens array
on the corresponding receivers. Among the technological solutions which can be imagined to
achieve the deflection function, we can mention : a microprism array ; a micrograting array ;
a 2D CGH which can realize the collimation and the deflection functions by the use of
off-axis Fresnel microlenses.

Figure 3 : an optical 2D perfect-shuffle implementation using a microprism array.

We have first studied the microprism solution (figure 3). Our goal is to estimate the
deflection limitations of this technique and to verify whether it is able to produce a highly
parallel 2D perfect-shuffle pattern.


The experimental setup is described in figure 4. It is composed of the following elements.

- An optical source which is a pigtailed laser diode. The fiber has a 5 µm core diameter,
which makes it possible to simulate a VCSEL (Vertical Cavity Surface Emitting Laser diode).

- Two NPL (National Physical Laboratory) 103x103 microlens arrays with 250 µm diameter
microlenses.

- A BK7 glass flat which can be rotated to simulate different prism angles.

- An acquisition system for detection.

Figure 4 : experimental setup to study the deflection system. (Labels: NPL 103 x 103,
250 µm diameter microlens arrays; lens system; acquisition system.)


To define the characteristics of the network, we have required the Fresnel reflection loss
on the microprism face to be less than 10 %, and the collecting microlenses to truncate the
beams, which are assumed to be Gaussian, at $1/e^2$.

So, the maximum tolerable value for the deflection angle is 63° and the glass flat thickness
is 20 mm. To evaluate the performance of the system we have studied the two extreme values of
the deflection angle : 0° and 63°.


4.  Results 

Theoretical simulations and experimental measurements were done to evaluate the following
parameters:

- the shift D (see figure 4), which limits the level of parallelism;

- the optical power budget between the emitter and the receiver;

- the output image quality, which will define the receiver size.
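
As a rough illustration of how the shift D arises, the sketch below computes the lateral displacement of a ray crossing a tilted plane-parallel plate using Snell's law. It assumes that the 63° value is the angle of incidence on the 20 mm glass flat and that the refractive index of BK7 is about 1.51; the source states neither the wavelength nor the index, so both the function name and `N_BK7` are assumptions made for this example.

```python
import math

def lateral_shift(thickness, incidence_deg, n_glass):
    """Lateral displacement of a ray crossing a tilted plane-parallel plate.

    thickness is in the same unit as the returned shift (mm here).
    """
    theta_i = math.radians(incidence_deg)
    # Snell's law at the air-to-glass interface
    theta_r = math.asin(math.sin(theta_i) / n_glass)
    return thickness * math.sin(theta_i - theta_r) / math.cos(theta_r)

# Assumed refractive index for BK7 (not given in the text).
N_BK7 = 1.51

for angle in (0.0, 63.0):
    print(angle, round(lateral_shift(20.0, angle, N_BK7), 2))
```

Under these assumptions the shift at 63° comes out close to 11.18 mm, in line with the simulated and measured values reported for the shift D.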
Experimental results are compared with simulations in table 5 and figure 6.


Table 5 : comparison between experimental results and simulations.

                            SIMULATION                EXPERIMENT
                            0° tilt     63° tilt      0° tilt     63° tilt
Shift D                     0 mm        11.184 mm     0 mm        11.18 ± 0.06 mm
Connection transmittance    70.6 %      61.1 %        64 % ± 6    54 % ± 5

[Figure panels: experiment at 0° tilt; experiment at 63° tilt]

Figure 6 : comparison between experimental output images and simulations.


As can be observed, the shift D, the transmittance and the output image are in good agreement with
simulations. Moreover, the image quality and the transmittance are quite satisfactory.


Characteristics projected by simulation for a folded 2D perfect-shuffle network (see figure 3) in
the OE-DIPE architecture using SF11 microprisms are presented in table 7.


Table 7 : characteristics projected by simulations for a folded 2D perfect-shuffle pattern.

                                                Case 1                       Case 2
                                                200 µm microlens diameter    400 µm microlens diameter
Number of inputs                                128 × 128                    256 × 256
PE array area                                   ~ 2 × 2 inch²                ~ 8 × 8 inch²
Microprism array thickness, e (see figure 3)    3 cm                         12 cm
Maximal single link attenuation loss            ≈ 3 dB                       ≈ 3 dB


5.  Conclusion 

We have presented a general-purpose optoelectronic massively parallel architecture. Processing
remains electronic and communications are optical. A specific optical network was developed for a
parallel architecture of more than 100 × 100 elements, including a very compact single folded
stage network.

This network uses microlens and microprism arrays. Further improvements will use off-axis Fresnel
microlens arrays. These components are well suited to planar electronic integrated technology.

References

[1] Fracès M et al 1991 Proceedings of Journ. d'étude des fonctions opt. dans l'ordinat. (Brest, September 91)

[2] Murdocca M 1990 Int. J. of Optoelectronics 5 (2) 191-203

[3] Cheng L and Sawchuk A A 1992 Appl. Opt. 31 (26) 5468-5479

[4] Stone H S 1971 IEEE Trans. on Computers 20 (2) 153-161


Inst.  Phys.  Conf.  Ser.  No  139:  Part  II 

Paper  presented  at  Opt.  Comput.  Int.  Conf.,  Edinburgh,  22-25  August  1994 
© 1995 IOP Publishing Ltd




Performance  analysis  of  optical  multistage  networks  and 
Abstract. A bit sequence can be routed properly by a banyan network using
integrated-optic directional couplers, and the possible optical interpretations of this
network are investigated. In this paper we consider holographic interconnections
between nodes and stages for some types of optical interconnection networks and
present experimental results for holographic interconnections in photorefractive
waveguides formed by titanium and iron indiffusion in LiNbO3.


1.  Introduction 

One of the most promising applications of optical computing techniques is interconnections
for VLSI systems. The communications crisis in the area of VLSI circuits and systems is very
serious. Problems of clock distribution and data communication inside VLSI chips can be solved by
using optical interconnections with optical fibers and holographic techniques [1]. Volume
holograms in waveguides offer a straightforward means of interfacing dynamically reconfigurable
interconnections with integrated optoelectronic devices. Previous uses of thick holograms in
waveguides have included grating couplers and distributed feedback lasers [2]. Holograms for
dynamic applications have also been considered, especially in photorefractive crystals. In this
paper we consider holographic interconnections between nodes and stages for some types of
optical interconnection networks and present experimental results for holographic
interconnections in photorefractive waveguides formed by double indiffusion in LiNbO3.

It is shown that there exists a topologically equivalent class of multistage interconnection
networks which includes the indirect binary n-cube network, and that the multistage interconnection
networks in the class can be developed based on the n-cube network [3].

In this paper we investigate the possible optical interconnections and propose non-blocking
optical interconnections and the use of integrated holographic devices in the proposed
networks.

2.  Motivation  of  holographic  interconnections 

In many interconnection networks in use, regularity is observed. A hologram, with the
aid of several conventional optical components, can be useful to realize complex and massive
interconnections among processors. Additionally, thick volume holograms provide very high
diffraction efficiency and low crosstalk for interconnections. Elementary holographic gratings can
be used to implement interconnection links between individual processing elements of two distinct
planes in a multi-layered optical network. For example, a 5-grating hologram can perform a 2-D
mesh interconnection network of any size, and a 2^n-node binary hypercube interconnection
network can be easily realized by an n-grating volume hologram.

In  addition,  planar  multiplexed  holograms  have  been  suggested  for  coupling  between 
totally  internally  reflected  beams  propagating  in  a  dielectric  substrate.  Holograms  have  also  been 
demonstrated  which  couple  light  from  the  evanescent  field  of  a  guided  mode  in  a  waveguide. 

The principal advantage of using holographic elements in photorefractive materials for
interconnect applications is the ability to write the hologram with a single exposure and to erase
it by the double-exposure technique with a variable phase shift between the two recorded gratings.

3. Waveguide photorefractive hologram

Consider a planar waveguide fabricated by diffusion of Fe in x-cut LiNbO3. The z-axis is
defined to be the principal axis of propagation in the waveguide. The x-axis is normal to the plane
of the waveguide and the y-axis is transverse to the z-axis in the plane of the waveguide.

After the first stage of Fe in-diffusion (T = 960 °C, diffusion time = 4 hours, and Fe film
thickness = 60 nm), we repeated the diffusion procedure with other parameters (T = 1020 °C,
diffusion time = 6 hours, and Fe film thickness = 60 nm). The mechanism of the photorefractive
effect in the Fe:LiNbO3 structure is based on Fe²⁺ → Fe³⁺ interconversion, and the space charge
set up by the action of light is attributed to a charge redistribution between divalent and trivalent
impurities.

After ion doping under reducing conditions, new characteristic absorption bands are seen.
As a result, we have to restrict the maximum of the current in the photoconductive spectra. For the
preparation of the photoconductive cell, a photolithographic technique is used to make an Al
interdigital electrode pattern on the clean surface of LiNbO3 and on the Fe:LiNbO3 in-diffused
waveguide. The sizes of the electrode finger and gap are 8 and 4 µm, respectively. These devices
are used to examine the relation between the photoconductive current and the light wavelength.
Since the photorefractive effect involves the photoexcitation of mobile carriers from impurity
centers (Fe²⁺ impurities) to the conduction band, the sensitivity is expected to be strongly
correlated with the absorption spectra of the substrate crystals [4].

Two  methods  for  holographic  gratings  were  studied.  In  the  first  method,  a  classical  setup 
is  used.  The  substrate  with  the  waveguide  is  placed  at  the  intersection  of  two  coherent  light  beams. 
One  of  the  beams  may  be  considered  the  reference  beam,  and  the  other  the  signal  beam.  We 
assume that the bisector of the two beams is normal to the crystal surface, and we choose the
z-axis in the plane of incidence normal to the bisector. Referring to Fig. 1, we consider a scheme of
two-wave  mixing  for  the  study  of  photorefractive  dynamic  holograms.  An  input  optical  beam  is 
split  by  a  beam  splitter  into  a  pump  beam  and  a  signal  beam  which  interact  inside  a  photorefractive 
crystal.  In  this  case  the  energy  efficiency  is  defined  as  the  ratio  of  the  optical  power  of  the 
amplified  signal  beam  to  that  of  the  input  beam  in  the  Bragg  condition.  In  addition,  we  can 
measure  the  diffraction  efficiency  using  an  output  prism  of  the  waveguide. 

Typically the incident power in each beam was 200 mW, and the gratings were written in a
0.5 s exposure. These parameters were adjusted to produce approximately a 50:50 splitting
ratio when He-Ne guided-wave light was incident at the appropriate Bragg angle.

The Bragg angle for the grating was given by

$$\theta_B = \sin^{-1}\left(\frac{\lambda}{2 \Lambda N}\right) \qquad (1)$$

where θ_B denotes the Bragg angle, λ the wavelength, Λ the grating pitch, and N the refractive index
of the guided mode. For the given values of λ = 633 nm, Λ = 1.4 µm and N = 2.2241, we have
the Bragg angle θ_B = 6.2°. The optical beam in the waveguide was reduced to approximately
150 µm width by focusing externally, and the interaction length L varied from 1 to 2 mm. The
deflection efficiency is defined in this case as the power coupled out in the deflected beam divided
by the sum of the powers coupled out in the deflected and the transmitted beams.

Fig. 1 Photorefractive waveguide for optical interconnects. [Figure labels: nonwaveguide beam;
photorefractive crystal; normal mode of substrate]

Diffraction  efficiencies  for  the  guided  beams  were  as  high  as  8  %.  By  rotating  the  grating 
in  the  z-y  plane,  we  were  able  to  find  the  optimum  Bragg  condition.  In  each  case  of  measuring  the 
diffraction  efficiencies,  we  found  a  difference  between  the  calculated  and  measured  ones  because 
of  the  effect  of  spatial  dephasing  between  the  absorption  and  the  phase  gratings.  This  is  indeed 
true  for  most  photosensitive  materials  since  the  same  physical  or  chemical  processes  are  usually 
responsible  for  both  properties.  The  variation  of  the  diffraction  efficiency  was  approximately  12%. 

In the second method, we have used an acousto-optic control. The grating was written into the
waveguide. In our experiment, an acousto-optic Bragg cell is operated at 910 MHz, which gives a
wide range of angles from 2.2° to 4.6°. This allows us to use holographic gratings made with
small  phase  shifts  or  different  periods.  This  system  requires  synchronization  in  order  to  stop  the 
recording  of  the  second  grating  when  the  efficiency  reduces  to  its  minimum.  We  extended  the 
double-exposure  technique  to  the  case  of  time-averaging  exposure.  Extension  to  time-averaging 
holographic  exposure  simplifies  the  proposed  interconnection  system. 

4.  Optical  interconnection  applications 

The implementation of a large space switch requires the interconnection of many smaller
switches used as building blocks. For point-to-point networks, the interconnection of these
building blocks to construct a large switching system can be done with a Benes, Cross, banyan or
mesh interconnection network or a hypercube network.

For optical interconnection applications, the signal beam (in the first method) is expanded
through a binary matrix to carry the interconnection pattern. Depending on the experimental
configuration, such a mask can be used to realize a 1-to-N×N interconnection or an N×N crossbar
switch. To achieve maximum energy efficiency, we need to match the beam profile spatially at the
photorefractive substrate. The decompositions of a 2-D mesh interconnection network and of an
8-node binary hypercube interconnection network with holographic implementation are considered.

A mesh interconnection network is used to communicate with the four nearest neighbors of any
node in the array. An 8-node hypercube network can be decomposed into 6 regular interconnect
patterns.

It is clear that a hologram with five stored gratings can easily realize a mesh
interconnection network of any size, and a hologram array with three stored gratings in each
holographic element can implement the 8-node hypercube interconnection network. We have
generated gratings by using a Fourier transform configuration of interconnection patterns.
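
The counts quoted above (6 regular patterns for the 8-node hypercube, three of them used at each node) can be checked with a short sketch. It assumes that a "regular interconnect pattern" means a constant index shift, so that every hypercube link from node i to the node with bit k flipped is grouped by the shift +2^k or -2^k. This interpretation and the function name are assumptions made for the example, not something stated in the text.

```python
from collections import Counter

def hypercube_shift_patterns(n):
    """Group the directed links of a 2**n-node binary hypercube by index shift."""
    patterns = {}
    for node in range(2 ** n):
        for k in range(n):
            neighbour = node ^ (1 << k)   # flip bit k of the node address
            shift = neighbour - node      # +2**k if bit k was 0, else -2**k
            patterns.setdefault(shift, []).append((node, neighbour))
    return patterns

patterns = hypercube_shift_patterns(3)
# Number of outgoing links (shift patterns in use) per node
links_per_node = Counter(src for links in patterns.values() for src, _ in links)

print(sorted(patterns))               # the distinct shift patterns
print(set(links_per_node.values()))   # patterns used at each node
```

With n = 3 this yields six distinct shifts (±1, ±2, ±4) and three outgoing links per node, which matches the six-pattern decomposition and the three gratings per holographic element.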

5.  Conclusion 

We  have  described  holographic  interconnections  and  demonstrated  the  dynamic  formation 
of holographic gratings in photorefractive double Fe-in-diffused LiNbO3 waveguides. The results
we have obtained show the validity of the holographic technique for fabricating gratings in
photorefractive waveguides. We have proposed the double-exposure technique combining
optical and acousto-optical controls. Combined with the large storage capacity available with
holographic  recording,  this  double-exposure  technique  could  be  suitable  for  the  optical 
implementation  of  learning  networks  and  some  types  of  interconnection  networks.  The  advantage 
of  holographic  interconnections  is  that  different  interconnection  networks  are  addressable  for  the 
same  multi-computer  systems. 
Abstract. This paper introduces a Computer Aided Design (CAD) approach for
partitioning and placement in designing and packaging optoelectronic systems into
Opto-Electronic Multichip Modules (OE MCM). We first discuss the design issues of speed,
power dissipation, area and fabrication limits in optoelectronic system design with
free-space optical interconnects. We then define the formulations for OE MCM partitioning
and chip placement. New algorithms are described for optimizing the partitioning based on
the minimization of the power dissipation, and the placement based on the fabrication limit.
Results are given for an example multistage interconnect network: reductions of more than
50% in power consumption and in the maximum interconnect distance can be achieved.


1.  Introduction 

Free-space optical interconnections (FSOI) have been shown to have a speed or power
advantage over electronic interconnections for long distance interconnections [1]. By
introducing FSOI into new MCM designs, it is possible to create an OptoElectronic MCM
(OE MCM). However, incorporating optics into electronics presents new challenges in
computer-aided design, fabrication, and packaging. To integrate electronics and optics at
the system level, we need new partitioning and placement algorithms. In this paper, we
will try to address these important CAD issues.

Figure 1. Schematic of a physical model of an OE MCM package in reflective configuration.
Only a few of the many actual interconnects are shown.

Figure 2. Cross-section of an Opto-Electronic MCM package. [Figure labels: MCM Package; OE Chip]

Figure  1  shows  a  physical  model  of  an 
OE  MCM.  There  are  three  layers  in  the 
packaged  system.  The  lower  layer  consists  of 
multiple  chips  of  electronics  and 
optoelectronic  devices  on  MCM  substrate. 
Each  chip  contains  a  number  of  switching 
elements  (SEs)  or  processing  elements  (PEs). 
The middle layer consists of diffractive
optics (DOEs) or computer generated
holograms (CGHs). The upper layer is a
mirror.  The  cross  section  of  this  system  is 
shown  in  Figure  2.  The  optical  transmitters, 
e.g.  surface  emitting  lasers,  illuminate  an  off- 
axis  lens  where  the  beam  is  collimated  and 
deflected  into  the  desired  direction.  The 
beam  then  reflects  off  a  mirror  and  is  focused 
by  another  off-axis  lens  onto  a  receiver 
(detector)  on  another  chip. 

There are four aspects of optoelectronic MCM design that are key to the system
performance: interconnection speed, heat dissipation, chip size and MCM size, and maximum
optical interconnect distance. Currently, optical sources (such as modulators or laser diodes
and their drivers) generate heat in densities much higher than that of VLSI. Based on speed,
switching energies, and power budgets, electrical and optical interconnect technologies can be
compared and a break-even interconnection length, say l_be, can be defined [1]. When l_be is
approximately the lateral size of the chips, optical interconnection technology is preferred for
chip-to-chip interconnects, where the signals must propagate distances longer than l_be. For
interconnects among SEs within chips, where the signal propagation distance is less than l_be,
electronic interconnection technology is preferred. It is also known that the minimum feature
size of the fabrication technology limits the maximum interconnect distance [2]. Under these
technological constraints, partitioning and chip placement can be optimized by CAD
methods.


2.  System  Partitioning  of  OE  MCM 

Optoelectronic system partitioning is performed: (a) to divide the total number of SEs in
the system into a smaller number of chips; and (b) to choose the best technology for each
interconnect required by the netlist of a certain computing architecture, so that one of the
system performance parameters is minimized while the remaining three performance parameters
mentioned above are constrained. For example, given a total of N SEs, the task is to divide the
N SEs into M partitions so that the constraints are satisfied and the goal is achieved. The size
constraint will be defined as NC SEs per chip. The connection is defined by the netlist of the
system. Within the chips, connections will be electrical, while among chips the interconnection
will be optical. The goal is to minimize the power consumed by the whole system, including both
optoelectronic chips and optical interconnects.

Figure 3. The straightforward partitioning of a twin-butterfly interconnection network into a
4x5 OE MCM (left). The numbers of optical interconnects on each chip are listed (right). The
pw[i] is the number of optical interconnects for the ith chip. The 1st index of each item is the
stage index, and the 2nd index is the SE index in the stage.

Figure 4. The algorithm partitioning of a twin-butterfly interconnection network into a 4x5
OE MCM (left). The numbers of optical interconnects on each chip are listed (right).

To solve this OE MCM partition problem, we present an algorithm based on a modified weighted
matching algorithm, which is based on labeling techniques [3], and Burkard's heuristic [4],
adjusted to solve our optoelectronic MCM partitioning problem. The modified weighted matching
algorithm is used here to minimize the power consumption. By virtue of using the matching
algorithm, the size constraint is automatically satisfied. The optimized partitioning is
achieved by rearranging the PEs among different chips to reduce the number of interchip
interconnections, and therefore to reduce the power dissipation. To evaluate the algorithm, we
compare the result with that of a straightforward partitioning which follows 2-D raster ordering
inside each chip (as shown in Figure 3 for a 20-chip MCM with a maximum of 25 SEs/chip).
Figure 4 shows the result for the same setup done by the algorithm. The hottest chip in the 1st
case dissipates 2 times more heat than in the 2nd case. For various numbers of chips, Figure 5
gives a comparison of the results of random partitioning with the results of the CAD algorithm
on an irregular interconnection network example [5]. A reduction in power of over 50% is
achieved in many of the cases studied [6].

Figure 5. Comparison of the power dissipation of the interconnects between random partitioning
and algorithm partitioning. [Plot: horizontal axis No. of Chips, 0 to 20; curve label Algorithm]

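
The quantity that the partitioning tries to reduce, the number of optical (inter-chip) interconnects per chip, can be evaluated for any candidate partition with a few lines of code. The sketch below is only a cost evaluation, not the modified weighted matching algorithm or Burkard's heuristic. The toy ring netlist, the two candidate assignments, and the choice to count a crossing link once at the transmitting chip and once at the receiving chip are all assumptions made for illustration.

```python
from collections import Counter

def optical_links_per_chip(netlist, chip_of):
    """Count inter-chip (optical) links touching each chip.

    netlist: iterable of (source_se, destination_se) pairs.
    chip_of: dict mapping each SE to the chip it is assigned to.
    """
    pw = Counter()
    for src, dst in netlist:
        a, b = chip_of[src], chip_of[dst]
        if a != b:        # the link crosses a chip boundary, so it is optical
            pw[a] += 1    # transmitter side
            pw[b] += 1    # receiver side
    return pw

# Hypothetical toy netlist: 8 SEs connected in a ring, 2 chips of 4 SEs each.
netlist = [(i, (i + 1) % 8) for i in range(8)]
interleaved = {se: se % 2 for se in range(8)}    # alternate SEs between chips
contiguous = {se: se // 4 for se in range(8)}    # keep ring neighbours together

pw_interleaved = optical_links_per_chip(netlist, interleaved)
pw_contiguous = optical_links_per_chip(netlist, contiguous)
print(max(pw_interleaved.values()), max(pw_contiguous.values()))
```

In this toy case the contiguous assignment leaves far fewer optical links on the busiest chip than the interleaved one, which is the kind of difference the partitioning algorithm exploits while keeping the number of SEs per chip fixed.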
3.  SE  Placement  in  OE  MCM 

In the OE MCM design, placement is the design phase in which the exact physical
location for each SE in the chips is determined. There are many ways to determine the
placement of a system. For example, one can select the location of each SE by random
ordering, which we call a random placement, or by the natural 2-D raster ordering, which
we call a straightforward placement. A good placement can reduce the fabrication
requirements for the CGHs, allowing a larger OE system to be laid out and built for the
same fabrication requirement and cost. Since the demands on the fabrication technology
increase with the maximum interconnect distance, it is the maximum interconnection distance
that should be minimized in the OE system placement.

For example, in the case of placing a multistage interconnection network onto an OE MCM
where each stage with many SEs is placed on one chip, an iterative matching algorithm can be
used [5]. The physical model of such a case has been shown in Figure 1. The algorithm starts
with a random placement of SEs within chips and reduces the maximum interconnection distance
through a fixed number of iterations by rearranging PEs within chips. For example, Figure 6
shows the histogram results of the straightforward placement and of the placement by the
iterative matching algorithm for comparison, for the same network example used in section 5.
Figure 7 shows the actual layouts of the 1st stage of the interconnection network. Figure 7(a)
is the result of the straightforward placement method, and Figure 7(b) is the result of the
algorithm placement. In both cases, the 1st stage is the worst case for the whole network, and
the worst case interconnects are marked. More than a 50% reduction in the maximum interconnect
distance (or in the packaging volume of the system) can be achieved when we apply the
algorithms to the OE-MCM design.

Figure 6. Comparison between two placement methods. (a) Result of the straightforward method.
(b) Result after applying the placement algorithm. The maximum interconnect distance is reduced.
[Histogram panels: straightforward placement; algorithmic placement; horizontal axis:
Interconnect distances]

Figure 7. 2-butterfly OE MCM. Only the 1st stage is shown here. (a) Straight-Forward Placement.
The connections for nodes 22, 23, 30 and 31 are marked. Worst case is 8.602 from 23 to 56.
(b) Algorithm Placement. The connections for nodes 49, 39, 20 and 5 are marked. The worst case
is 4.24 from 49 to 49. The improvement is about 50%.

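
The worst-case figure quoted for the straightforward placement can be reproduced with a small sketch. It assumes that the 64 nodes of a stage are laid out in raster order on an 8 × 8 grid, that source and destination nodes share the same grid coordinates, and that the interconnect distance is the Euclidean distance in grid units. These assumptions are not stated explicitly in the text but are consistent with the quoted value of 8.602 from node 23 to node 56. The function names and the example link list are hypothetical.

```python
import math

def raster_position(node, width=8):
    """(row, column) of a node under 2-D raster ordering on a width-wide grid."""
    return divmod(node, width)

def interconnect_distance(src, dst, place=raster_position):
    """Euclidean distance, in grid units, between two placed nodes."""
    (r1, c1), (r2, c2) = place(src), place(dst)
    return math.hypot(r1 - r2, c1 - c2)

def max_interconnect_distance(links, place=raster_position):
    """Largest interconnect distance over a list of (src, dst) links."""
    return max(interconnect_distance(s, d, place) for s, d in links)

# Hypothetical link list containing the worst-case pair quoted for Figure 7(a).
links = [(23, 56), (0, 1), (9, 18)]
print(round(max_interconnect_distance(links), 3))
```

A placement algorithm would supply a different `place` mapping and try to lower the value returned by `max_interconnect_distance`; the quoted 4.24 for the algorithmic placement corresponds to a displacement of 3 grid units in each direction.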
5.  Conclusion. 

In this paper, we discussed optoelectronic MCM models and CAD algorithms for partitioning
and placement. Results show that more than 50% reduction in power consumption and in the
maximum interconnect distance can be achieved by applying these CAD algorithms to an OE MCM
design.
Abstract: In this paper, the design and implementation of a high speed free space optical
interconnect based on an optical ring topology is described. This interconnect system operates
at 500 MHz and consists of 16 laser transmitters, a four channel free space interconnect, and a
high speed receiver. A Nearest Neighbor
(NN)  interconnect  has  been  successfully  demonstrated.  At  the  data  rate  of  500  MHz,  the  total  system 
throughput  is  8  Gbps.  The  system  can  easily  be  operated  at  much  higher  data  rates  since  the  rate  was  only 
limited  by  the  electronic  circuitry.  This  interconnect  is  very  promising  in  the  implementation  of  ultra  fast 
massively  parallel  SIMD  machines. 


1. Introduction

Electronic  interconnections  have  been  recognized  as  the  bottlenecks  of  high  performance 
computing  systems.  Because  of  their  3-D  nature  and  their  matched  impedance,  optical  free  space 
interconnects  are  the  best  alternative  to  electronic  counterparts.  Massively  Parallel  Machines 
(MPMs)  require  that  the  interconnect  latency  between  processors  be  kept  as  small  as  possible.  For 
this purpose, we have developed optical interconnects using ring topologies [1]. This novel
interconnect arranges the processors on a ring instead of the rectangular grid configuration used in
most current systems. It relies on imaging, uses only space-invariant elements and completely
eliminates the boundary effect which exists in systems based on rectangular topologies.
Operations required by the ring topologies can be implemented with simple image rotations
using Dove prisms. Because of its regular interconnection scheme, the proposed interconnect
is appropriate for inter-processor communications of Single Instruction Multiple Data-stream
(SIMD)  machines  [2].  As  a  continuation  of  our  previous  work,  here  we  present  the  experimental 
implementation  of  a  500  MHz  free  space  interconnect  based  on  the  ring  topology.  Section  2 
briefly  describes  the  ring  topology.  Section  3  describes  the  system  setup,  while  the  experimental 
results  are  shown  in  Section  4. 

2. Ring topology

The principle of a Nearest Neighbor ring topology is described here. Assume an array of N = n × n
processors in a two-dimensional plane as shown in Fig. 1(a). The NN interconnect can
mathematically be expressed by

$$NN_{+1}(i) = i + 1, \qquad (1)$$

$$NN_{-1}(i) = i - 1, \qquad (2)$$

$$NN_{+n}(i) = i + n, \qquad (3)$$

$$NN_{-n}(i) = i - n. \qquad (4)$$

This  is  a  variation  of  a  mesh  architecture  that  allows  wraparound  connections.  The  major 
problem  is  that  the  interconnection  length  between  processors  is  not  the  same  on  the  boundary  as 
in  the  central  region.  This  unequal  interconnection  length  will  cause  system  latency.  The  problem 
becomes  severe  when  the  total  number  of  processors  is  increased.  By  arranging  the  processors  in 
a  ring  as  shown  in  Fig.  1(b),  this  boundary  effect  can  be  eliminated.  Other  rectangular  grid 
topologies  such  as  PM2i  and  Hypercube[3],  can  be  implemented  in  a  similar  fashion.  In  addition, 
operations  required  by  the  ring  topologies  can  be  implemented  by  performing  simple  image 
rotations. 
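
To make the link between ring shifts and image rotations concrete, the sketch below maps each of the four NN operations (i ± 1 and i ± n) onto a rotation angle. It assumes that the N processors are equally spaced around a single ring in index order, so that a shift by k positions corresponds to an image rotation of 360°·k/N, and it uses the standard property that rotating a Dove prism by an angle rotates the image by twice that angle. The function names and the equal-spacing layout are assumptions for this example; the actual arrangement is the one shown in Fig. 1(b).

```python
def ring_neighbour(i, shift, n_processors):
    """Index reached from processor i by a shift along the ring (wraps around)."""
    return (i + shift) % n_processors

def image_rotation_deg(shift, n_processors):
    """Image rotation that maps every ring position i onto position i + shift."""
    return 360.0 * shift / n_processors

def dove_prism_angle_deg(shift, n_processors):
    """Prism rotation needed: a Dove prism rotates the image by twice its own angle."""
    return image_rotation_deg(shift, n_processors) / 2.0

N, n = 16, 4   # 16 processors, as in a 4 x 4 array
for shift in (+1, -1, +n, -n):
    print(shift,
          ring_neighbour(15, shift, N),
          image_rotation_deg(shift, N),
          dove_prism_angle_deg(shift, N))
```

Because every processor sees the same rotation, there is no special case at the array boundary: processor 15 shifted by +1 simply lands on processor 0.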

3.  Optical  system  design 

For  the  implementation  of  the  ring  topology  based  interconnect  system,  we  imposed  the 
following  constraints. 

1. Keep system complexity down by minimizing the number of elements.

2. Allow only rotationally invariant optical elements.

3.  Maintain  a  constant  high  interprocessor  communication  rate. 

4.  Maintain  identical  latency. 

The  schematic  of  the  experimental  system  is  shown  in  Fig. 2.  It  consists  of  a  data 
generator,  an  input  ring  assembly,  a  four-channel  free  space  interconnect,  and  an  output  ring 
assembly.  The  input  ring  assembly  consists  of  16  laser  transmitters;  each  being  individually 
modulated  by  a  500  MHz  PN  data  sequence  generated  by  a  data  generator.  In  the  four  channel 
interconnect,  each  channel  is  designed  to  perform  one  of  the  four  operations  required  by  the  NN 
interconnect.  DP1-DP4  are  Dove  prisms.  Each  Dove  prism  is  prerotated  to  a  direction  to  perform 
the  required  imaging  rotation  operation.  To  demonstrate  channel  switching  operations,  a 
polarization  mechanism  is  applied.  BS1-BS6  are  polarizing  beamsplitters.  Liquid  Crystal  (LC) 
switches (s1-s5) are employed in each channel. The selection of the interconnect channel can be
done  by  controlling  the  settings  of  the  LC  switches.  A  single  GaAs  receiver  is  employed  at  the 
position  of  the  output  ring  assembly.  Data  sequences  for  the  four  interconnect  channels  are 
received  sequentially  by  setting  the  proper  state  of  each  LC  switch.  The  received  data  sequence 
is  then  compared  with  that  of  the  input. 

4.  Experimental Results

We  have  used  the  above  system  design  to  implement  the  NN  interconnect.  As  mentioned  above, 
the  implementation  of  the  NN  interconnect  requires  four  different  operations:  i+1,  i+4,  i-1,  and 
i-4 connections. To perform these operations, the Dove prisms in all four channels were pre-rotated
to angles chosen so that each channel realized one of the four operations required
by the NN interconnect. In the experiment, channels 1 through 4 were assigned to perform the
i+4, i+1, i-4, and i-1 connections, respectively. The data generator was set to generate a 500
MHz PN data sequence. During the experiment, the LC switches were set so that channels 1 to
4 were selected sequentially. The experimental results are shown in Fig. 3. They
show that the data sequences corresponding to the NN interconnect were received
correctly for all 16 transmitters. We also demonstrated that other topologies such as PM2i and
Hypercube  can  be  implemented  by  performing  different  rotation  operations.  This  novel 
interconnect,  using  space  invariant  optical  elements,  having  identical  interconnect  latency,  and 
allowing  network  reconfigurability,  will  be  suitable  for  high  speed  Massively  Parallel  Machines 
such  as  SIMD  machines. 
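
The ring mapping described above can be sketched in a few lines. This is only an illustration: it assumes the 16 transmitters are equally spaced on a circle, so that a shift by k positions is an image rotation of 360k/16 degrees, and it uses the standard property that a Dove prism rotates the image by twice its own rotation angle. The function names and the sign convention are hypothetical, not taken from the experiment.

```python
RING_SIZE = 16  # number of transmitters on the ring, as in the experiment

# Channel-to-offset assignment quoted in the text (channels 1 to 4).
NN_OFFSETS = {1: +4, 2: +1, 3: -4, 4: -1}


def ring_shift(i, offset, n=RING_SIZE):
    """Index of the receiver reached from transmitter i by a ring shift."""
    return (i + offset) % n


def image_rotation_deg(offset, n=RING_SIZE):
    """Image rotation that realises a shift by `offset` positions
    (assumes equally spaced elements on the ring)."""
    return 360.0 * offset / n


def dove_prism_angle_deg(offset, n=RING_SIZE):
    """Prism pre-rotation: a Dove prism rotates the image by twice
    its own rotation angle, so half the image rotation is needed."""
    return image_rotation_deg(offset, n) / 2.0


def nn_neighbours(i, n=RING_SIZE):
    """The four nearest neighbours of element i under the NN interconnect."""
    return sorted(ring_shift(i, off, n) for off in NN_OFFSETS.values())


if __name__ == "__main__":
    for channel, off in NN_OFFSETS.items():
        print(channel, off, dove_prism_angle_deg(off))
    print(nn_neighbours(0))
```

Under these assumptions the i+1 and i+4 connections correspond to prism pre-rotations of 11.25 and 45 degrees, and the wrap-around at the ends of the ring comes from the modulo operation rather than from any special boundary handling.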

Fig. 1  (a) Nearest Neighbor (NN) interconnect configured on a rectangular grid;
(b) the same interconnect but with a ring topology.


Fig. 3  Experimental results of the Nearest Neighbor interconnect. The pulse width is 2 ns. (a)
corresponds to the i+4 interconnect; (b) corresponds to the i+1 interconnect; (c) corresponds to the i-4
interconnect; (d) corresponds to the i-1 interconnect.

Abstract. Spatial light modulator based routing systems offer potential advantages
over conventional electronic and waveguide-based systems. Integration of ferroelectric
liquid crystal over silicon smart pixel modulators permits high levels of switch
parallelism to be achieved.


1.  Introduction 

Free-space  optical  interconnection  enables  much  greater  fan-out  and  fan-in  than  can  be 
accessed  by  electronics.  These  features  make  it  appropriate  to  re-examine  single-stage 
crossbar  architectures.  The  purpose  of  this  review  is  to  summarise  the  technology  of 
spatial  light  modulator  routing  systems  and  to  consider  scalability,  control  and  arbitration 
issues. 


2.  Device  technology 

The spatial light modulators (SLMs) in the systems proposed here are intended to be
silicon integrated circuits with an overlay of chiral smectic liquid crystal [1]. Very large
electro-optic effects are observed in these birefringent ferroelectric materials: on inverting
the polarity of the voltage across the liquid crystal layer (whose thickness is normally 2 µm
or less) the crystal optic axis rotates through an angle $2\theta$, which may vary from a few
degrees to close to 90°. The liquid crystal switching speed varies strongly with this angle.

In  reflective  devices  built  over  silicon  die,  the  factor  limiting  the  overall  attenuation 
ratio  is  the  quality  of  the  aluminium  pads  that  form  mirrors  on  the  silicon  die.  Efforts 
are  underway  to  improve  this  and  to  increase  the  optical  quality  of  the  modulators  to 
approach  that  of  devices  in  all-glass  cells  [2].  Attenuation  ratios  of  several  100  have  been 
measured  for  silicon  backplane  devices  (compared  with  values  up  to  2000  for  glass  cells), 
and  this  is  expected  to  increase  with  improvements  in  device  technology  and  planarisation. 

Currently optimised materials [3] with switching angles approaching 45° and wide
phase temperature ranges are approaching 10 µs switching speeds at voltages around 10 V
at 45°C, figure 1. In addition, very high switching angle materials (~90°) such as Chisso
2004 can now provide polarization insensitive phase modulation of the input light,
removing the need for polarisers either side of the cell. However, smaller switching angles
(perhaps only a few degrees) and thus higher switching speeds may be useful in some
applications. In such devices it seems likely that switching speeds of the order of 100 ns
will be possible.

Figure 1: Switching time vs. voltage at 45°C for material DRA2

The very large electro-optic effects that are observed enable high fan-out/fan-in
switch structures to be constructed, and the electronic functionality of the silicon backplane,
capable of electronically controlled mirrors and sensitive photodetectors, forms a
powerful ‘smart pixel’ technology.


3.  Basic  Switch  Architectures 

3.1.  Matrix-matrix  crossbar 

An  optical  vector-matrix  processor  for  fast  Fourier  transform  calculations  was  proposed  in 
1978  [4],  later  identified  as  a  single-stage,  non-blocking  space-switch,  and  then  extended  to 
the  general  matrix-matrix  crossbar  [5].  These  architectures  passively  fan-out  each  optical 
input  towards  every  output.  The  replications  of  the  inputs  are  then  ‘shadowed’  by  means 
of  a  reconfigurable  shutter  array  and  hence  selective  fan-in  is  achieved  at  the  output  plane, 
figure  2. 

Electrically addressed liquid crystal SLMs operating in a binary transmission mode
are suitable high-speed switching arrays that can be used as the shutter plane. Each input
replication must be optically resolved through a single shutter such that any arbitrary
interconnection pattern may be formed, including broadcast, multicast and multiple-input
fan-in. Reconfiguration of the switch simply involves closing any (single) shutters
corresponding to completed calls and opening any new paths required.

The  intrinsic  replication  of  the  optical  inputs  which  is  associated  with  this  crossbar 
leads to a power loss of $1/N$ per input, given $N$-to-$N$ routing. However, by imaging the
replications  into  the  shutter  plane,  very  little  of  the  SLM  needs  to  be  optically  active, 
allowing  control  and  arbitration  circuitry  to  be  easily  placed  around  the  pixels.  A  passive 
diffractive  phase  plate  or  hologram  is  well  suited  to  performing  this  fan-out  operation. 
Fan-in  can  then  be  achieved  using  (multiple)  lens  arrays  which  could  also  be  diffractive. 

The argument of reciprocity leads to the conclusion that if the input and output
devices have the same numerical aperture, there must also exist a fan-in power loss in
addition to the fan-out replication loss. An optically transparent matrix-matrix crossbar
with N inputs, placed in a single-mode fibre network, therefore exhibits a $1/N^2$ total power
loss per input, plus any additional loss due to the optics, and will probably require pre- and
post-amplification stages using optical amplifiers. However, the fan-in loss may be avoided
by using high numerical aperture devices at the output, such as multi-mode fibres.

Figure 2: Matrix-matrix crossbar concept

Figure 3: Holographic crossbar
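
To give a feel for the size of these intrinsic losses, the short sketch below converts the replication factors into decibels. It assumes an ideal N-way split at fan-out and, when the fan-in loss applies, a further factor of 1/N at the output; excess loss in the optics is ignored. The function names are hypothetical.

```python
import math


def to_db(transmission):
    """Convert a power transmission factor to decibels."""
    return 10.0 * math.log10(transmission)


def fan_out_loss_db(n):
    """Intrinsic loss of replicating one input towards n outputs."""
    return to_db(1.0 / n)


def transparent_crossbar_loss_db(n, fan_in_loss=True):
    """Intrinsic loss per input of an n-input crossbar.

    fan_in_loss=True models equal numerical apertures at input and
    output (loss 1/n**2); False models a high numerical aperture
    output such as a multi-mode fibre (fan-out loss only).
    """
    loss = fan_out_loss_db(n)
    if fan_in_loss:
        loss += to_db(1.0 / n)
    return loss


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, round(fan_out_loss_db(n), 2),
              round(transparent_crossbar_loss_db(n), 2))
```

For a 64-input switch this gives roughly 18 dB from fan-out alone and roughly 36 dB when the fan-in loss is also incurred, which illustrates why a high numerical aperture output or optical amplification becomes attractive.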

Due to crosstalk effects caused by the incomplete extinction of light through those
pixels which are ‘off’, the output signal-to-noise ratio (SNR) and hence the switch scalability
is essentially determined by the contrast ratio, C, of the liquid crystal shutters.
Given N inputs, the SNR at any output port is of the magnitude [expression missing in source].
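
As a rough illustration of how the contrast ratio limits scalability, the sketch below uses a simple leakage model. It is an assumption made for this example, not an expression taken from the text: every input carries equal power, each of the N-1 closed shutters in a fan-in group leaks a fraction 1/C of the power passed by an open shutter, and the leaked light adds incoherently.

```python
def shutter_crossbar_snr(contrast_ratio, n_inputs):
    """SNR at one output port under the assumed leakage model:
    one open shutter (signal) and n_inputs - 1 closed shutters,
    each leaking 1/contrast_ratio of the signal power."""
    if n_inputs < 2:
        return float("inf")  # no other inputs, so no crosstalk
    return contrast_ratio / (n_inputs - 1)


def max_inputs_for_snr(contrast_ratio, min_snr):
    """Largest switch size whose SNR stays at or above min_snr
    under the same assumed model."""
    return int(contrast_ratio // min_snr) + 1


if __name__ == "__main__":
    for n in (16, 64, 256):
        print(n, round(shutter_crossbar_snr(1000, n), 1))
    print(max_inputs_for_snr(1000, 10))
```

In this model the SNR falls off approximately as C/N for large N, so doubling the number of ports requires roughly doubling the shutter contrast ratio to hold the SNR constant.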

3.2.  Dynamic  holographic  crossbar 

The  generation  and  use  of  computer-generated  holograms  is  well  documented,  e.g.,  [6]. 
Such  holograms  are  usually  the  coarsely  quantized  representations  of  the  sampled  2-D 
Fourier  transform  of  a  certain  desired  output  image  or  spot.  The  principle  of  operation 
of  the  holographic  crossbar  is  the  use  of  holograms  to  deflect  as  much  optical  power 
as  possible  from  the  inputs  to  the  outputs  by  eliminating  the  initial  fan-out  operation 
associated  with  the  generic  matrix-matrix  architectures.  Thus  the  interconnect  plane  is 
divided  into  N  routing  areas  and  each  routing  area  is  filled  with  one  from  a  set  of  base 
holograms  which  are  possibly  stored  in  a  non-volatile  memory  behind  the  interconnect 
plane,  figure  3.  Each  hologram  acts  as  an  independent  diffraction  grating  and  has  an 
associated quantization noise limited diffraction efficiency, $\eta$, which will generally be less
than  unity  [7].  Input  broadcasting  or  multicasting  may  be  achieved  simply  by  designing 
a  routing  hologram  to  produce  more  than  one  output  peak.  The  holograms  are  spatially 
invariant  so  fan-in  from  multiple  inputs  to  a  single  output  may  also  be  achieved  by  placing 
the  same  hologram  in  more  than  one  routing  area. 

An  electrically  addressed  SLM  operating  in  a  binary  phase  mode  produces  a  more 
efficient  system  than  an  amplitude  mode  device  because  the  zero-order  (i.e.,  non-diffracted 
path)  can  be  eliminated  in  the  output  plane  by  ensuring  there  are  equal  pixel  numbers  of 
each  phase  state.  However  both  configurations  lead  to  a  redundant  rotational  symmetry 
in  the  output  plane  because  of  the  binary  nature  of  the  Fourier  plane,  which  can  only  be 
removed  by  utilisation  of  more  than  two  phase  levels  in  the  hologram. 
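
Both properties can be checked numerically on a toy example. The sketch below is a one-dimensional illustration only: the eight-pixel pattern is hypothetical, and a plain discrete Fourier transform stands in for the Fourier lens. Binary phase states 0 and π are represented by pixel amplitudes +1 and -1, and the amplitude-mode version of the same pattern by 1 and 0.

```python
import cmath


def dft(samples):
    """Plain discrete Fourier transform (stands in for the Fourier lens)."""
    n = len(samples)
    return [
        sum(s * cmath.exp(-2j * cmath.pi * k * x / n)
            for x, s in enumerate(samples))
        for k in range(n)
    ]


def replay_intensity(hologram):
    """Intensity in the output (replay) plane for a 1-D hologram."""
    return [abs(v) ** 2 for v in dft(hologram)]


# Hypothetical 8-pixel pattern with equal numbers of each state.
phase_hologram = [1, 1, -1, -1, 1, -1, 1, -1]    # phases 0 and pi
amplitude_hologram = [1, 1, 0, 0, 1, 0, 1, 0]    # same pattern, on/off

if __name__ == "__main__":
    print([round(v, 3) for v in replay_intensity(phase_hologram)])
    print([round(v, 3) for v in replay_intensity(amplitude_hologram)])
```

With equal numbers of +1 and -1 pixels the zero-order term of the phase hologram vanishes, whereas the amplitude hologram keeps a strong zero order. In both cases the hologram is real-valued, so the replay intensity at order k equals that at order -k, which is the redundant symmetry referred to above.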

A key scaling issue of the holographic interconnect is the number of pixels required
per routing hologram to provide acceptable noise characteristics. Discrete Fourier transforms
have the property that to be able to resolve the N output ports, each routing hologram
must contain $mN$ pixels, where $m > 2$ because of the binary redundant symmetry.
As an approximation, if we assume that $m$ is sufficiently large that the Gaussian tails of
the output peaks do not coincide and that the diffraction noise is uniformly distributed,
the SNR at any output port will be of the order of [expression missing in source]. As with the
matrix-matrix crossbar, the holographic switch exhibits a $1/N$ SNR dependency, figure 4. The curves
shown are for 44 x 44 pixel routing areas, compared to a matrix-matrix SNR curve with
a contrast ratio of 1000.

Figure 4: Signal-to-noise ratio at a single output port

4.  Data throughput, switch control and arbitration

Free-space photonic switches based on ferroelectric liquid crystals offer high levels of
parallelism, and reconfiguration times may approach a microsecond. For optically transparent
fibre optic switches the data throughput is determined by the optical power budget and
can be very high. The reconfiguration time is considerably longer than the bit period of
the data streams, but this is acceptable in many applications and any potential data loss
during switch-over can be either accepted or avoided by scheduling the reconfiguration.

The routing algorithms for non-blocking single-stage switches are simple compared
with those for multi-stage blocking switches since all paths through the switch are mutually
independent. For the switches based on shuttering light beams it is simply necessary
to open the shutter corresponding to the requested path through the switch. For
holographic routing, once the path is requested, the required hologram pixel pattern is
ascertained via a look-up table, and the control circuitry writes this pattern to the pixel
array. There is such a hologram for every path through the switch. In both cases there
is a group of pixels onto which the appropriate pattern must be written. For the matrix-matrix
switch this is the fan-in group serving each output port. For the holographic switch
it is the array of pixels serving each input port. An objective would be that the speed of
arbitration and addressing via the silicon backplane should allow this to be accomplished
in a time that is smaller than the switch reconfiguration time.

Control information must be communicated to the SLM in the interconnect plane of
the switch. This information may be separated from the data paths through the switch,
carried on an overlay electronic or optical network (depending on the switching rate)
linking the switch inputs and the SLM, figure 5, or it may be embedded in the main data
paths using the existing free-space optics and photo-detectors integrated on the SLM
silicon backplane, figure 6.

Figure 5: Overlay control network

Figure 6: Embedded control network

In  a  space  switching  context,  the  two  types  of  free-space  photonic  switches  discussed 
here  are  strictly  non-blocking  single-stage  designs.  The  detection  of  contention  external  to 
the  switch  and  the  taking  of  the  steps  necessary  to  stop  multiple  channels  being  directed 
towards  the  same  output  port  are  also  considerably  simplified  in  these  structures  (as 
opposed  to  multistage  switches).  Arbitration  facilities  for  resolving  contention  can  be 
built  directly  into  the  switch  control  electronics,  and  special  circuitry  to  accomplish  this 
can  be  placed  in  the  silicon  backplane  of  the  interconnect  SLM. 

In a matrix-matrix switch, contention is avoided simply by allowing only one pixel
to be opened in each ‘fan-out’ or ‘fan-in’ group. In this case, major simplifications to
both arbitration and switch control result from the fact that there is only one unique
path between switch input and switch output, rather than a choice of paths as there is
in multistage switches, and each such path is associated with only a single electronically
accessible smart pixel.
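
The arbitration rule for the shutter-based switch is simple enough to sketch directly. The class below is a hypothetical software model, not the backplane circuitry: it keeps at most one open shutter in each fan-in group (one group per output port), refuses a request that would direct a second input to a busy output, and allows one input to be broadcast to several outputs.

```python
class ShutterCrossbar:
    """Toy model of arbitration in an N x N shutter crossbar."""

    def __init__(self, n_ports):
        self.n_ports = n_ports
        self.open_shutter = {}  # output port -> input port currently routed

    def _check_port(self, port):
        if not 0 <= port < self.n_ports:
            raise ValueError(f"port {port} out of range")

    def request(self, source, destination):
        """Open the shutter (source, destination) unless the output is busy.
        Returns True if the path was set up, False on contention."""
        self._check_port(source)
        self._check_port(destination)
        if destination in self.open_shutter:
            return False  # another input already owns this fan-in group
        self.open_shutter[destination] = source
        return True

    def release(self, destination):
        """Close the shutter serving a completed call."""
        self.open_shutter.pop(destination, None)

    def shutter_state(self):
        """N x N boolean array: state[i][j] is True if input i reaches output j."""
        state = [[False] * self.n_ports for _ in range(self.n_ports)]
        for destination, source in self.open_shutter.items():
            state[source][destination] = True
        return state
```

Because each path corresponds to exactly one shutter, setting up or clearing a call touches a single entry, and the contention check is a single look-up on the destination port.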

In dynamic hologram switches, collision detection is best situated at the look-up
table where all routing requests are received. This facility could perhaps be situated on
the interconnect SLM; however, separating the data and control paths and using an overlay
network for control is more likely in this case because of the extra difficulty in free-space
communication with the interconnect SLM, which is in a Fourier plane.


5.  Conclusions 

The  data  throughput  of  optically  transparent  fibre  optic  switches  is  determined  by  the 
optical  power  budget  and  can  be  very  high.  The  use  of  smart  pixel  systems  therefore 
provides  a  viable  switch  technology  for  connection  of  at  least  several  hundred  users.  Both 
systems  display  good  SNR  and  in  principle  could  be  made  lossless  and/or  polarisation 
insensitive  through  use  of  appropriate  materials. 

The holographic architecture could potentially form a very low loss 1:1 interconnection
switch. The output SNR is dependent on the number of hologram pixels used in each
routing area and can be very high. The matrix-matrix switch requires fewer pixels and
may operate faster because each pixel represents a connection path through the system,
rather than a whole holographic image. The SNR is limited by the contrast ratio of the
liquid crystal shutters.

Using FELC/VLSI technology, control and arbitration electronics can be associated
with either SLM pixels in matrix-matrix switches, or arrays of pixels for holographic
architectures, resulting in ‘smart pixel’ or ‘smart hologram’ systems.

Abstract  A free-space single-stage optical crossbar switch has been designed
and  fabricated.  A  liquid  crystal  on  silicon  SLM  is  used  as  the  active 
re-routing  element.  The  initial  results  of  the  switch  components  are  reported 
indicating  the  scalability  of  this  technology. 

1  Introduction 

The use of free-space optics to perform high bandwidth connections for computing and
communications applications has been widely studied in recent years [1][2]. In
this paper we present the initial practical results from a free-space, single-stage, optical
crossbar switch which has been developed as part of a collaborative project entitled
OCPM (Optically Connected Parallel Machines). Previous work from this project has
been reported elsewhere [3]; here we present details of a 64x64 crossbar switch. The
particular area that this switch addresses is where a high data bandwidth (>1 Gbit/s) and
large number of nodes (>16) are required together with a medium length reconfiguration
latency (~10 µs).

The origins of the optical crossbar switch using the vector-matrix multiplier architecture
have been detailed previously [4] and a number of designs have been described. The
performance and compactness of space switches of this type are enhanced by adopting
a two dimensional crossbar architecture [5] as shown in figure 1. This matrix-matrix
structure can be implemented using free-space optics based on binary phase gratings,
and spatial light modulators (SLMs) based on ferroelectric liquid crystals integrated
with silicon VLSI (FELC/VLSI) [6]. These SLMs allow large arrays of shutters (or optical
crosspoints) to be made, each with a large contrast ratio. For switches with modest
reconfiguration times, these features remove the main motivations for using multi-stage
re-arrangeable networks and may allow switch control functions to be simplified by the
use of the single-stage crossbar architecture. The control and arbitration functions can
be integrated into the silicon VLSI backplane of the crosspoint SLM to form a compact
and eventually self-routing system.

If intrinsic fan-in loss is to be avoided, the collection channel must have a greater
numerical aperture (N.A.) and/or area than that of the input channels. If the input is
from N mono-mode fibres, then the output could be collected into a large cross-section
multi-mode fibre that supports at least N modes. By totally filling the N.A. of the multi-mode
fibre the number of input channels which may be used before intrinsic fan-in loss
occurs is given by:

[equation (1) missing in source]

where $r_1$ is the radius of the beam emitted from the mono-mode
input fibres and $r_2$ is the beam radius accepted by the multi-mode
output fibre. For mono-mode fibres $(\mathrm{N.A.})_1 r_1 =$ [expression missing in source].


2  Description  of  64x64  switch 


Figure 1: Plan of the 64x64 crossbar baseplate


Figure  1  shows  the  layout  of  the  64  input,  64  output  optical  system  developed.  A 
matrix-matrix  crossbar  design  has  been  used  and  data  inputs  to  the  switch  are  provided 
by  790nm  laser  diodes  pigtailed  to  polarisation  preserving  fibres.  To  create  the  most 
compact  system,  the  design  uses  one-to-one  imaging  and  hence  the  input  fibres  are 
arranged in an array that matches the pixel spacing of the SLM. These are 80 µm cladding
diameter polarisation preserving fibres which are arranged in an 8x8 square array with a
120 µm pitch. Each fibre has its polarisation axes orientated in the same direction. This
fibre array was fabricated in an array of holes formed by excimer laser material ablation
of a Kevlar substrate [7].

To  obtain  a  compact  and  rugged  system,  the  optical  elements  have  been  mounted  on  a 
baseplate  [8]  which  allows  fine  alignment  of  the  system.  The  light  from  the  input  array  is 
fanned  out  by  means  of  a  binary  phase  grating  based  on  a  non-separable  two-dimensional 
design.  A  half-wave  plate  is  used  to  align  the  polarisation  of  the  channels  with  the  liquid 
crystal  axes  of  the  SLM.  The  reflected  outputs  of  the  SLM  are  then  re-imaged  onto  the 
holographic fan-in array which focusses the light into the multi-mode output fibres. The
output fibres are 300 µm core diameter, step-index with an N.A. of 0.22, arranged to form
an 8x8 square on a 960 µm pitch. The substrate used for this array is two layers of silicon
with  arrays  of  wet-etched  holes  [7].  Putting  the  specifications  of  this  fibre  into  equation 
1  implies  that  a  fan-in  of  approximately  28,000  is  achievable  before  any  intrinsic  fan-in 
losses  occur.  In  practice,  it  is  less  than  this  due  to  misalignments  in  the  fibre  arrays, 
but  is  still  much  greater  than  the  64  of  the  present  system.  The  fan-out  and  fan-in 
elements  are  such  that  each  input  can  address  all  the  output  fibres,  the  state  of  the  SLM 
determining  which  channels  are  open  at  any  given  time. 
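
The fan-in figure quoted above can be cross-checked with a small calculation. Two assumptions are made here that are not confirmed by the text: that the fan-in limit has the usual étendue-matching form N = [(N.A.)_2 r_2 / ((N.A.)_1 r_1)]^2, and that the product (N.A.)_1 r_1 for a mono-mode fibre is taken as λ/4. The output fibre data (300 µm core, N.A. 0.22) and the 790 nm source wavelength are from the paper; the function name is hypothetical.

```python
def fan_in_capacity(na_out, r_out_um, na_r_in_um):
    """Number of input channels that can be collected without intrinsic
    fan-in loss, assuming N = (NA_2 * r_2 / (NA_1 * r_1)) ** 2."""
    return (na_out * r_out_um / na_r_in_um) ** 2


WAVELENGTH_UM = 0.79                       # 790 nm laser diodes
NA_OUT = 0.22                              # multi-mode output fibre N.A.
R_OUT_UM = 300.0 / 2.0                     # 300 um core diameter
NA_R_SINGLE_MODE_UM = WAVELENGTH_UM / 4.0  # ASSUMPTION: (N.A.)_1 r_1 = lambda / 4

capacity = fan_in_capacity(NA_OUT, R_OUT_UM, NA_R_SINGLE_MODE_UM)

if __name__ == "__main__":
    print(round(capacity))
```

Under these assumptions the estimate comes out close to the figure of approximately 28,000 stated in the text, far above the 64 channels of the present system, so the conclusion does not depend on the exact constant chosen for the mono-mode fibre.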

3  Initial  results 

The laser diodes and photo-diodes are being incorporated into transceiver units which
will enable high bandwidth signals to be sent through the switch. Initial results from
the transceiver units indicate a performance of 500 Mbit/s at a Bit Error Rate (BER)
of $10^{-10}$ at -23 dBm. Table 1 indicates the current losses expected from the 64x64
switch when fully assembled. With the 6 dBm sources to be used in the system we
expect to achieve these data rates with this prototype system. The liquid crystal SLM
is operating in the book-stack geometry, providing a high contrast latching device, with
reconfiguration speeds of the order of 20 µs. Control and arbitration of the switch is
performed through a P.C. interface during this evaluation stage; however, it is intended
to incorporate this functionality on the silicon backplane of the SLM.

Table 1: 64x64 crossbar losses

Component                               Transmission   Loss (dB)
Fusion splice (laser to fibre array)    0.98             -0.10
Intrinsic fan-out                       0.0156          -18.06
Fan-out element                         0.75             -1.25
SLM                                     0.70             -1.55
Reflection losses                       0.79             -1.00
Fan-in element                          0.50             -3.01
Coupling into output fibre              0.95             -0.22
Detector coupling                       0.96             -0.20
TOTAL                                   0.00289         -25.39
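
The entries of Table 1 can be combined into a simple link budget. The sketch below sums the quoted losses and compares the received power with the -23 dBm level at which the transceiver result was reported. Treating that level as the receiver sensitivity, and the 6 dBm figure as the power launched per channel, is an interpretation made for this example.

```python
# Losses from Table 1, in dB.
LOSSES_DB = {
    "fusion splice": -0.10,
    "intrinsic fan-out": -18.06,
    "fan-out element": -1.25,
    "SLM": -1.55,
    "reflection losses": -1.00,
    "fan-in element": -3.01,
    "coupling into output fibre": -0.22,
    "detector coupling": -0.20,
}

SOURCE_POWER_DBM = 6.0            # source power quoted in the text
RECEIVER_SENSITIVITY_DBM = -23.0  # ASSUMPTION: level quoted for the BER result

total_loss_db = sum(LOSSES_DB.values())
total_transmission = 10 ** (total_loss_db / 10.0)
received_power_dbm = SOURCE_POWER_DBM + total_loss_db
margin_db = received_power_dbm - RECEIVER_SENSITIVITY_DBM

if __name__ == "__main__":
    print(round(total_loss_db, 2), round(total_transmission, 5))
    print(round(received_power_dbm, 2), round(margin_db, 2))
```

On this reading the received power is about -19.4 dBm, leaving a margin of a few dB, which is consistent with the expectation stated in the text that the prototype can reach the quoted data rate.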

4  Conclusions

The development of free-space optical crossbar switching technology has been achieved.
The 64x64 matrix-matrix single-stage switch described here is designed to have no intrinsic
fan-in losses and therefore is scalable to larger sizes dependent upon the inherent
fan-out losses. The performance of the present components indicates that this 64x64
switch can sustain a data rate of 500 Mbit/s at a BER of $10^{-10}$ and will be extendible
to a 128x128 switch at 2.5 Gbit/s. The single stage nature of the switch simplifies
the required contention and arbitration issues which become a major factor for a large
switch.

A compact holographically routed optical crossbar
using a ferroelectric liquid-crystal over silicon
spatial light modulator


D.C. O’Brien, Douglas J. McKnight, Optoelectronic Computing
Systems centre, Campus Box 525, University of Colorado, Boulder,
CO 80309, USA. (tel 303 492 3330)


Abstract. A demonstration of a 16 channel holographic crossbar switch
is presented. The switch uses a 4x4 array of vertical cavity surface emitting
lasers (VCSELs) as individual channel inputs. Light from each laser
illuminates a portion of a ferroelectric liquid crystal spatial light modulator
(SLM), which displays a computer generated hologram (CGH) in a binary
phase mode. Holograms are designed to direct the appropriate source to the
desired output port of the crossbar. Simulation and experimental results are
presented.


1.  Introduction 

Free  space  interconnects  have  wide  application  in  optical  and  optoelectronic  computing 
and information processing. Such interconnects are routinely implemented using binary
and multilevel phase CGH techniques (see for instance [1]) and excellent results are reported.
However, most holographic interconnects are fixed, which limits their flexibility.
In  this  paper  a  reconfigurable  holographic  interconnect  structure  is  presented,  based  on 
the  crossbar  geometry  [2]. 


2.  Overview 

Figure  1  shows  a  schematic  of  the  crossbar  switch.  An  array  of  collimated  beams,  one  for 
each  input  port,  illuminates  an  SLM.  Each  input  illuminates  a  separate  Fourier  computer 
generated  binary  phase  subhologram,  which  is  designed  to  route  light  to  the  centre  of  the 
desired  output  detector.  The  Fourier  lens  provides  the  necessary  transform  and  fans-in 
light  from  all  the  input  channels.  For  a  16  channel  crossbar  there  are  16  base  holograms, 
each  of  which  routes  light  to  a  particular  output  port.  The  state  of  the  switch  is  set  by 
displaying  the  correct  hologram  in  front  of  the  correct  illuminating  beam. 


An array of vertical cavity surface emitting lasers (VCSELs) is collimated using
a microlens array. The array of beams then passes through a beam expander (approximately
5X) so the VCSEL array matches the hologram pitch on the SLM. The SLM
displays a 4x4 array of subholograms in a binary phase mode. Light is reflected off the
SLM, and an achromat takes a Fourier transform to create a hologram replay at the output
plane. A CCD camera is placed at the output plane of the switch. Each component
is described more fully below.

2.1.  Sources

The crossbar uses a 4x4 array of VCSELs, manufactured by PR1 [3]. The complete array
is 8x8 in extent, on a pitch of 250 µm. The lasers emit a close to Gaussian beam with
a numerical aperture of ≈0.1, at a wavelength of ~840 nm. The output power is up to
2mW.  A  fused  silica  spacer  is  glued  to  the  surface  of  the  laser  array  and  a  microlens  is 
glued  to  the  spacer.  The  thickness  of  the  spacer  is  designed  so  that  the  sources  lie  at  the 
focus  of  the  microlenses,  creating  collimated  beams. 

2.2.  Spatial light modulator (SLM)

The SLM used in the experiment is a 256 x 256 liquid crystal over silicon SLM, with 21.6
µm pixel pitch, and a total aperture of 5.53 mm. It is described fully in [4]. The device
is used in a binary phase mode, the phase states being obtained as follows. Each pixel can
be considered to be a waveplate which can be electrically switched so that its optic axes
rotate through an angle $2\theta$ in the plane of the pixel, where $\theta$ is the tilt angle of the liquid
crystal. The input light to the device is linearly polarised, the polarisation oriented so
it bisects the angle the optic axes of the liquid crystal switch through, and the output
analyser is placed orthogonal to the polariser. The two possible output states are of
equal amplitude and $\pi$ radians apart. There is a loss penalty, which can be removed if
$\theta = \pi/4$ (normally $\theta = \pi/8$ for intensity modulation) and the pixel is a half waveplate. In
this particular case the phase modulation is polarisation independent.
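
The two output states can be checked with a short Jones-calculus sketch. It assumes an ideal, lossless waveplate of retardance Γ whose axis sits at +θ or -θ from the input polarisation, with the analyser crossed to the polariser. For that geometry the transmitted amplitude works out to -i sin(Γ/2) sin(2α), where α is the axis angle. The function names and the global phase convention are choices made for this example.

```python
import math


def crossed_analyser_amplitude(axis_angle_rad, retardance_rad=math.pi):
    """Complex amplitude passed by a crossed analyser for a waveplate
    whose axis is at axis_angle_rad to the input polarisation.
    Derived from the Jones matrix R(-a) diag(e^{-iG/2}, e^{iG/2}) R(a)."""
    return (complex(0.0, -1.0)
            * math.sin(retardance_rad / 2.0)
            * math.sin(2.0 * axis_angle_rad))


def binary_phase_states(tilt_rad, retardance_rad=math.pi):
    """Amplitudes of the two switched states (axis at +tilt and -tilt)."""
    return (crossed_analyser_amplitude(+tilt_rad, retardance_rad),
            crossed_analyser_amplitude(-tilt_rad, retardance_rad))


def efficiency(tilt_rad, retardance_rad=math.pi):
    """Fraction of the input power transmitted in either state."""
    a_plus, _ = binary_phase_states(tilt_rad, retardance_rad)
    return abs(a_plus) ** 2


if __name__ == "__main__":
    for tilt_deg in (22.5, 45.0):
        print(tilt_deg, round(efficiency(math.radians(tilt_deg)), 3))
```

The two amplitudes are equal in magnitude and opposite in sign, that is, π radians apart, for any tilt angle. With a half-wave pixel the transmitted fraction is sin²(2θ), which is 0.5 at θ = π/8 and reaches unity at θ = π/4, matching the loss penalty and its removal described above.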

A  preliminary  experiment  was  undertaken  to  examine  the  phase  performance  of 
the SLM. A series of horizontal stripes was written to the SLM and viewed in
a  phase  mode.  This  creates  a  grating  which  should  have  no  zero  order  component  in 
replay.  The  ratio  of  the  first  to  zero  orders  was  measured  to  be  35:1,  which  compares 
favourably  with  other  ferroelectric  devices  [5]  and  shows  the  device  has  a  robust  binary 
phase  mode. 

2.3.  Holograms 

The  subholograms  are  64x64  pixels  in  extent  and  are  designed  using  a  simulated  annealing 
technique.  They  are  designed  to  route  to  a  2x8  array  of  output  ports,  corresponding 
to  a  Silicon  photodetector  array  which  will  form  the  output  plane  in  future  crossbar 
demonstrations. 
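
The geometry quoted in sections 2.1 to 2.3 is self-consistent, as the following check shows. All numbers are taken from the text; the only assumption is that "matching the hologram pitch" means the expanded beam pitch equals the subhologram pitch. The variable names are hypothetical.

```python
SLM_PIXELS = 256           # pixels per side of the SLM
PIXEL_PITCH_UM = 21.6      # SLM pixel pitch
SUBHOLOGRAM_PIXELS = 64    # pixels per side of each subhologram
VCSEL_PITCH_UM = 250.0     # pitch of the laser array

# Full aperture of the SLM.
aperture_mm = SLM_PIXELS * PIXEL_PITCH_UM / 1000.0

# Number of subholograms that fit along one side of the SLM.
subholograms_per_side = SLM_PIXELS // SUBHOLOGRAM_PIXELS

# Pitch of the subhologram array and the beam expansion needed to
# map the VCSEL pitch onto it (assumes pitch-to-pitch matching).
subhologram_pitch_um = SUBHOLOGRAM_PIXELS * PIXEL_PITCH_UM
magnification = subhologram_pitch_um / VCSEL_PITCH_UM

if __name__ == "__main__":
    print(round(aperture_mm, 2), subholograms_per_side,
          round(subhologram_pitch_um, 1), round(magnification, 2))
```

The result is a 4x4 array of 64x64 subholograms filling the 5.53 mm aperture, with a required magnification of about 5.5, in line with the approximately 5X beam expander described in the overview.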


3.  Experimental  Investigation 

Successful operation of the crossbar requires that (i) the hologram peaks
are sufficiently uniform and that (ii) the signal to noise ratio (SNR) of the holograms
is large enough that outputs which should be dark do not receive an ‘on’ level. The
worst case crosstalk (defined as the ratio of the highest received dark level to the lowest
received on level) should be less than one for correct operation.

Each  base  hologram  was  displayed  on  the  SLM  and  the  replay  field  recorded.  A 
plot of the hologram spot intensity is shown in Figure 2 along with the measured SNR.
Together  these  can  be  used  to  simulate  the  received  power  at  each  detector  for  all  the 
potential  states  of  the  switch.  This  gives  a  worst  case  crosstalk  of  0.1  which  indicates 
the  crossbar  should  function  with  good  noise  margin. 
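
The crosstalk figure of merit defined above is straightforward to compute once the received power at each detector is known. The sketch below implements the definition for one switch state and takes the worst case over a set of states. The power values in the usage example are hypothetical and are not the measured data.

```python
def worst_case_crosstalk(received, intended_on):
    """Ratio of the highest dark level to the lowest on level
    for one switch state.

    received    -- power at each detector
    intended_on -- True where the detector should receive a signal
    """
    on_levels = [p for p, on in zip(received, intended_on) if on]
    dark_levels = [p for p, on in zip(received, intended_on) if not on]
    if not on_levels or not dark_levels:
        raise ValueError("need at least one on and one dark output")
    return max(dark_levels) / min(on_levels)


def worst_case_over_states(states):
    """Worst crosstalk over an iterable of (received, intended_on) pairs."""
    return max(worst_case_crosstalk(r, on) for r, on in states)


if __name__ == "__main__":
    # Hypothetical detector readings for a single switch state.
    received = [1.0, 0.8, 0.05, 0.08]
    intended_on = [True, True, False, False]
    print(worst_case_crosstalk(received, intended_on))
```

A value below one means a single threshold can separate every on level from every dark level; the smaller the value, the larger the noise margin.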

Figure  3  shows  an  image  of  the  SLM  and  the  output  plane  with  all  ports  illuminated. 
The  increased  zero  order  is  due  to  the  crystal  alignment  varying  over  the  surface  of 
the  SLM.  The  spot  pattern  should  form  a  regular  grid,  but  surface  curvature  of  the 
particular  SLM  used  causes  the  mismatch  at  the  replay  plane.  Compensation  for  this 
can  be  built  into  the  hologram  design  process.  The  SLM  transmits  about  0.4%  of  the 
light  an  ideal  device  would  pass  and  this  is  where  the  majority  of  the  losses  within  the 
system  lie.  However,  the  SLM  used  was  optimised  for  shorter  illumination  wavelengths. 
Using  a  device  optimised  for  this  wavelength  and  a  more  suitable  liquid  crystal  would 


Figure  3.  Image  of  SLM  Hologram  replay  (note  symmetrical  2x8  output 
arrays  with  zero  order  between) 


be  expected  to  improve  this  figure  considerably.  The  VCSELs  are  attractive  as  sources, 
as  they  are  compact,  easily  modulated  and  emit  relatively  high  power.  However,  they 
have  two  orthogonal  output  polarisations,  and  oscillate  between  them  unpredictably.  In 
the  experiment  a  waveplate  was  used  to  rotate  the  VCSEL  polarisation  so  that  the  SLM 
input polariser bisected the two orthogonal outputs, so that all sources were of roughly
equal intensity. More suitable solutions would be to optimise the SLM for polarisation
independent  modulation,  or  to  control  the  VCSEL  polarisation  state. 


4.  Conclusions 

The experiment demonstrates the principles and some of the attributes of this type
of architecture. It has no intrinsic fan-out loss and alignment is straightforward. It also
appears possible to design holograms which compensate for non-ideal system components.
SLMs with high space-bandwidth product are required but such devices continue
to become available.

Dynamic holographic routing has many advantages and a modest improvement
in  performance  should  make  such  routing  structures  attractive  for  optical  processing 
applications. 
Masamichi Okamura and Noriyoshi Yamauchi

NTT  Interdisciplinary  Research  Laboratories 
3-9-11,  Midori-cho,  Musashino-shi,  Tokyo  180  Japan 
Abstract. A highly efficient optical coupling system using microlens arrays for
free-space optical networks was demonstrated. High speed (<100 µs) and a high
extinction ratio for the 1.3 µm wavelength were obtained by a proposed ferroelectric
liquid crystal driving method.


1.  Introduction 


Two-dimensional  optical  switching  devices  consisting  of  liquid-crystal  cell  arrays  and 
birefringent plates have many advantages in making large-scale, transparent optical switching
networks. We have demonstrated that an 8-stage optical concentrator using 1024-input-port
optical beam shifter modules operates effectively for visible light [1, 2]. To use this type
of  module  in  an  optical  communication  network,  good  performance  for  long-wavelengths, 
highly  efficient  fiber-to-fiber  coupling  and  high-speed  switching  operation  are  required. 
This  paper  describes  a  new  optical  design  for  low-loss  interconnection  and  a  new  driving 
method  for  high-speed  ferroelectric  liquid  crystal  (FLC). 


Fig. 1  N x N free-space optical switching networks [3].
2.  Optical design for fiber interconnection

Figure 1 shows an example of an N x N optical switching network [3]. The distance between
input and output fiber arrays is several tens of centimeters, because the system consists
of cascaded optical beam shifter modules. Thus, beam expansion due to diffraction results
in very large increases of the coupling loss and limits the size of the switching network.
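
The scale of this diffraction problem can be estimated with the standard Gaussian-beam formula w(z) = w0 * sqrt(1 + (z/zR)^2), with zR = pi * w0^2 / lambda. The sketch below is illustrative only: the 100 µm waist is a hypothetical value, and 1.31 µm is taken as the source wavelength.

```python
import math


def beam_radius(z_m, waist_m, wavelength_m):
    """1/e^2 radius of a Gaussian beam at distance z_m from a waist of radius waist_m."""
    rayleigh_range = math.pi * waist_m ** 2 / wavelength_m
    return waist_m * math.sqrt(1.0 + (z_m / rayleigh_range) ** 2)


WAVELENGTH = 1.31e-6  # m, assumed source wavelength
WAIST = 100e-6        # m, hypothetical collimated beam waist

for z_cm in (0, 10, 20, 40):
    w = beam_radius(z_cm * 1e-2, WAIST, WAVELENGTH)
    print(f"z = {z_cm:2d} cm  ->  beam radius = {w * 1e6:7.1f} um")
```

Under these assumptions the Rayleigh range is only a few centimeters, so an uncorrected beam grows well beyond a sub-millimeter cell aperture over a span of several tens of centimeters.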

Figure 2 shows the newly proposed configuration of lenses for low-loss propagation.
Figure 2(a) shows the confocal lens system, which consists of a lens of very short focal
length and a lens of long focal length, for obtaining a long collimation distance. Figure 2(b)
shows the relay lens of long focal length, which compensates the beam expansion. The number
of relay lenses depends on the size of the switching system. Figure 3(a) shows the change in
beam radius along the propagation direction in Fig. 2(a) using a ball lens (f1=0.58 mm) and a
sputter-liftoff microlens (f2=25 mm). A long collimation distance was obtained with the
long-focal-length microlens. Figure 3(b) shows the compensation of beam divergence by a relay
lens (f=150 mm). The radius of the beam from a single-mode fiber with collimation lens was
reduced to about 40% at the position of 400 mm. These long-focal-length microlenses were
fabricated by sputter deposition using the shadowing effect. Figure 4 shows a photograph of
the sputter-liftoff microlens array (32 x 32, dia.=750 µm).


(a)  Confocal  lens  system  (b)  Relay  lens  system 

Fig. 2  Microlens  configuration  for  long-span  beam  propagation. 


Fig. 3  Propagation  profiles  for  (a)  collimation  lens  system  and  (b)  relay  lens  system. 


Fig. 4  Photographs of sputter-liftoff microlens array (32 x 32, dia.=750 µm).
[Fig. 5 labels: multi-mode fiber array; laser source (1.31 µm); BP: birefringent plate; BS1, BS2, BS3, BS4: optical beam-shifters; ML1, ML2: microlens arrays; BL: ball lens]

Fig. 5  Experimental  set-up  for  fiber-to-fiber  coupling  using  1  x  4  optical  switching  network. 

Figure 5 shows the experimental configuration of a polarization-independent 1 x 4 switching
system. On the input side, the input beam is separated into two components, an ordinary
ray and an extraordinary ray. These beams pass through two adjacent cells in each optical
beam shifter. At the final stage, the polarization of each beam is controlled and the beams are
combined into a single beam, irrespective of polarization. Optical beam shifter modules with
a 990 µm cell size and a 350 µm aperture radius are used. The output fiber array consists of
4 x 4 stacked ceramic ferrules [4]. Two long-focal-length lenses, ML1 (f=150 mm) and ML2
(f=25 mm), are very effective for highly efficient optical coupling between an input fiber and
an output fiber array. A very low total insertion loss of 8.7 dB was obtained for four output
ports. This value demonstrates accurate coupling to the output fibers in spite of the very long
propagation span of 424 mm.

3.  High speed optical switch for 1.3 µm light using FLC

Ferroelectric liquid crystal is a very attractive material for high-speed operation (two to
three orders of magnitude faster than a twisted-nematic liquid crystal). There are, however,
two main problems to solve in adapting it to optical switching devices for long wavelengths.
First, to obtain a high extinction ratio, the cell must be twice as thick as for visible light,
and this thicker cell reduces the FLC memory effect. Second, the ordinary bipolar driving
method is not adequate for optical switching because of the short break in the optical signal
caused by the reset pulse.
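
The factor of two in cell thickness follows if the FLC cell is assumed to act as a switchable half-wave plate, so that birefringence x thickness = wavelength / 2. The text does not state this condition explicitly, so the sketch below rests on that assumption; the birefringence value, the 0.65 µm visible wavelength and the neglect of dispersion are likewise hypothetical.

```python
def half_wave_thickness_um(wavelength_um, birefringence):
    """Cell thickness d (micrometres) satisfying birefringence * d = wavelength / 2."""
    if birefringence <= 0:
        raise ValueError("birefringence must be positive")
    return wavelength_um / (2.0 * birefringence)


DELTA_N = 0.13  # hypothetical FLC birefringence, assumed equal at both wavelengths

d_visible = half_wave_thickness_um(0.65, DELTA_N)   # assumed visible wavelength
d_infrared = half_wave_thickness_um(1.30, DELTA_N)  # 1.3 um operating wavelength

print(f"visible cell : {d_visible:.2f} um")
print(f"1.3 um cell  : {d_infrared:.2f} um")
print(f"thickness ratio = {d_infrared / d_visible:.2f}")
```

Because the required thickness scales linearly with wavelength in this model, doubling the wavelength doubles the cell gap.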


Fig. 6  New driving method for ferroelectric liquid crystals: (a) liquid-crystal cell circuit, (b) waveforms.

Fig. 7  Switching waveforms of transmitted light.

Fig. 8  Dependence of extinction ratio and insertion loss on wavelength.


To overcome these problems, we developed a new driving method for the FLC optical
switch. An active-matrix driving method with the voltage waveforms shown in Fig. 6 was
used. The unipolar data driving pulse prevents the short signal break. The low-voltage
part of the pulse sustains the cell data without losing reliability, and the high-voltage part
results in high-speed switching and a high extinction ratio. Figure 7 shows the switching
waveforms of transmitted light for various sustaining voltages. The high-speed switching
operation (<100 µs) was confirmed. This value is two orders of magnitude faster than for a
twisted-nematic liquid crystal. As shown in Fig. 8, a high extinction ratio (>30 dB) and a low
insertion loss (<1 dB) were obtained at the 1.3 µm wavelength.


4.  Conclusions 


We proposed a new microlens system configuration for highly efficient optical coupling.
Experiments demonstrated the system's low propagation loss. The long-span microlens
system was shown to be very useful for large-scale optical switching networks. An FLC
driving method for optical switches in optical communication networks was proposed.
High-speed operation and a high extinction ratio were obtained by this new method. These
optical switching modules are applicable to large-scale free-space optical switching networks.

We would like to thank Dr. Tomoyuki Toshima for his encouragement throughout the
course  of  this  work. 
Abstract: Liquid-crystal microprism arrays are shown to be very useful for electrically
controlled alignment of optical beams and free-space optical interconnection. They
can deflect closely spaced micro-optical beams individually to any position with a
high transmittance of 95% and a large deflection angle. Various optical
interconnections can be made simply by changing the voltage applied to each
microprism.

1.  Introduction 

One  problem  with  free-space  interconnection  is  the  complicated  mechanically  controlled 
alignment.  Using  many  mechanical  micropositioners  for  alignment  within  several  micrometers 
is  hard  work  and  increases  the  module  size.  Furthermore,  for  optical  interconnections  among 
two-dimensional  (2D)  optical  switches,  various  optical  interconnections  such  as  crossing, 
parallel,  and  concentration  are  necessary.  Hologram  cells,  prisms,  and  lenses  provide  fixed 
optical interconnections. Thus, techniques for electrically controlled alignment of
optical beams, using deflectors able to deflect in arbitrary directions, are desired for optical
interconnection. Various kinds of optical beam deflecting device, such as holographic
diffraction cells [1-3], have been developed for free-space optical interconnection. However,
they  have  drawbacks  such  as  large  loss,  low  steering  angle,  undesired  high-order  diffraction 
peaks,  and  a  limited  wavelength  range.  Sato  et  al.  [4]  explained  that  LC  Fresnel  lenses 
change  not  only  the  focal  length  but  also  the  propagation  direction  of  the  optical  beams.  We 
started  with  these  devices  and  improved  their  performance  for  free-space  optical 
interconnection.  We  call  them  LC  microprism  arrays.  This  paper  investigates  the  application  of 
these  arrays  to  optical  interconnection  for  electrically  controlled  alignment  of  optical  beams  and 
various  interconnections. 

2.  Device  structure  and  principle  of  beam  deflection 

The structure of the LC microprism array is shown in Fig. 1. This is a homogeneously
aligned nematic LC cell, in which microprisms with a pitch of 250 µm are fabricated on the
bottom transparent glass plate.

When no voltage is applied, the LC molecules align parallel to the glass plate. The input
optical beam is refracted at the prism plane according to Snell's law. When voltage is applied,
the LC molecules are realigned, and the refractive index decreases from ne to no. The
deflection angle thus changes with applied voltage. The deflection angle is determined by the
refractive indices of the LC and the transparent plate, and by the apex angle. A larger deflection
angle can be obtained by using an LC with larger refractive anisotropy. We used a Merck
BL-009 LC because it has the largest refractive anisotropy: 1.8181 (ne) and 1.5266 (no).
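
The dependence of the deflection angle on the LC index and the apex angle can be sketched with two applications of Snell's law. This is a simplified geometric model, not the authors' analysis: it assumes a beam travelling along the cell normal inside a prism of index n_plate, refraction at a facet inclined at the apex angle, and exit into air through a flat plate. The plate index of 1.52 is a hypothetical value; only ne and no are taken from the text.

```python
import math

N_E = 1.8181  # extraordinary index quoted for the LC (no voltage applied)
N_O = 1.5266  # ordinary index quoted for the LC (voltage applied)


def deflection_deg(apex_deg, n_lc, n_plate=1.52):
    """Output angle in air (degrees) for a beam entering along the cell normal.

    n_plate is an assumed index for the transparent prism material.
    """
    apex = math.radians(apex_deg)
    # Refraction at the inclined prism facet: plate -> liquid crystal.
    sin_lc = n_plate * math.sin(apex) / n_lc
    if abs(sin_lc) > 1.0:
        raise ValueError("total internal reflection at the prism facet")
    # Ray direction inside the LC, measured from the cell normal.
    tilt = apex - math.asin(sin_lc)
    # Exit through the flat plate into air (parallel interfaces).
    sin_out = n_lc * math.sin(tilt)
    if abs(sin_out) > 1.0:
        raise ValueError("total internal reflection at the exit face")
    return math.degrees(math.asin(sin_out))


for apex in (10, 20, 30):
    swing = deflection_deg(apex, N_E) - deflection_deg(apex, N_O)
    print(f"apex {apex:2d} deg: angular swing between ne and no = {swing:.1f} deg")
```

In this model the swing grows with both the apex angle and the index difference ne - no, in line with the qualitative statement above. It is not expected to reproduce the measured angles, since the real director profile and prism geometry are more complicated.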

3.  Experiments  and  Results. 

We  prepared  LC  microprism  arrays  with  apex  angles  of  10°,  20°,  30°,  and  40°.  The 
striped  ITO  electrodes  of  the  LC  microprism  were  connected  in  parallel  by  a  flexible  cable  to 
variable resistors (5 kΩ). The applied voltages are controlled by these variable resistors.

First,  to  clarify  whether  LC  microprism  arrays  are  feasible  for  optical  interconnection, 
we measured their basic performance using a single microbeam with a beam waist of about 100
µm. The beam profiles of the deflected beams are shown in Fig. 2 for the apex angle of 30°.
The  applied  voltage  dependence  of  the  deflection  angles  is  shown  in  Fig.  3  as  a  parameter  of 
the  apex  angle.  The  maximum  deflection  angle  was  15°.  The  optical  beams  were  deflected 
continuously  by  the  LC  microprism  up  to  2.8  V.  Transmittance  was  no  less  than  95%. 
However,  above  2.8  V,  the  optical  beams  were  deformed  and  split  into  two  peaks. 
Furthermore,  when  the  apex  angle  was  more  than  40°,  the  optical  beams  were  deformed  even 
with no applied voltage. This is because the LC molecule alignment is not homogeneous.
Thus, for optical interconnections, the apex angle should be smaller than 30° and the applied
voltage should be lower than 2.8 V. The available deflection angle was therefore 8°. The response
on-time was about 1-2 seconds, independent of the deflection angle and the apex angle. The
response off-time was very slow (10 to 40 seconds), and increased with the apex angle.

In  applying  the  LC  microprism  arrays  to  optical  interconnections  among  2D  switches,  as 
shown  in  Fig.  4(a),  we  demonstrate  deflections  of  optical  beams  from  a  fiber  array  with  the 
collimating microlens arrays shown in Fig. 4(b). An 8 x 8 fiber array with a 250-µm pitch was
used;  it  had  a  planar  microlens  array  with  the  same  pitch  as  that  of  the  fiber  array  for  the 
collimating  light  beams.  The  apex  angle  of  the  LC  microprism  array  we  used  was  20°.  The 
propagation  directions  of  the  beams  from  the  fiber  array  with  the  collimating  lens  were  not 
perfectly  the  same,  but  a  little  random  even  if  the  LC  microprism  array  was  not  used.  This  is 
because  the  fiber  arrays  were  assembled  with  a  slight  positioning  error  (a  few  micrometers). 
By  changing  the  resistors  connected  to  the  LC  microprism,  and  monitoring  the  spot  positions 
with  the  CCD  camera,  the  propagating  direction  of  each  beam  could  easily  be  controlled,  and 
any  optical  interconnection  was  possible,  as  shown  in  Fig.  5.  Next,  we  demonstrate  various 
optical  interconnections  using  the  crossed  two-LC  microprism  arrays,  which  can  deflect  optical 
beams  in  two  directions  X  and  Y.  The  outputs  of  the  CCD  camera  are  shown  in  Fig.  6.  Each  of 
eight  optical  beams  from  the  fiber  array  could  be  deflected  in  arbitrary  directions. 

Finally,  we  show  the  concept  of  a  2D  optical-switch  network  using  the  LC  microprism 
arrays  in  Fig.  7.  It  consists  of  2D  optical  switches,  microlens  arrays,  LC  microlens  arrays, 
and  an  LC  macrolens.  They  are  cemented  together  and  mounted  in  square  holders,  then 
mounted  on  an  L-shaped  block.  Optical  interconnections  between  the  optical  switches  are 
achieved  by  tuning  the  variable  resistors  connected  to  the  LC  microprism  arrays.  Complicated, 
precise,  mechanically  controlled  alignment  is  not  necessary  to  arrange  the  optical  components 
because  the  output  beam  directions  can  be  controlled  electrically  with  precise  positioning.  This 
module is very compact because it does not use macrolenses or diffractive devices.
4.  Summary

LC  microprism  arrays  were  shown  to  be  very  useful  for  free-space  optical 
interconnection  and  electrically  controlled  alignment  of  optical  beams:  these  arrays  can  deflect 
closely  spaced  optical  beams  individually  to  any  point  with  high  transmittance  (95%),  at  a  high 
deflection  angle,  and  a  low  voltage  (<  2.8  Vrms).  Various  optical  networks  can  be  constructed 
by  simply  changing  the  voltage  applied  to  each  microprism  array. 
Fig.  3  Deflection  angle  versus  applied  voltage


Fig.  2  Profiles  of  deflected  beams 


Fig.  5  Optical  interconnections  using  a  single-LC 
microprism  array 


Fig.  7  Concept  of  a  2D  optical  switch 
network  using  LC  microprism  arrays. 
Abstract

First experimental results on a 64-channel free-space photonic switching system are presented.
Two control schemes are demonstrated: direct optical addressing with potential signal
amplification, and self-routing operation acting on data packets. A brief study of uniformity
and noise effects, which are found to be the major limiting factors of the system, is given.

1.  Introduction 

We present a 1 to 64 free-space optical switch which can be operated both in packet
switching  and  circuit  switching  environments.  The  system  associates  a  spatial  light 
modulator  (0  (SLM)  and  an  optical  bistable  device  array,  both  based  on  GaAs/GaAlAs 
Multiple  Quantum  Well  (MQW)  structures.  The  SLM  is  an  8x8  array  of  individually 
addressable  transmission  type  electro-absorption  modulators.  The  second  active  element,  the 
optical  bistable  device  consists  of  a  non-linear  Fabry-Perot  etalon.  To  obtain  an  8x8  array 
of  bistable  devices,  pixels  are  simply  defined  by  separated  incident  light  spots  on  the  same 
cavity.  When  maintained  in  the  bistable  region  of  operation  by  a  holding  beam  array,  the 
devices  exhibit  a  two  state  memory  effect  for  about  1  ms,  illustrated  by  points  A  and  B  in 
figure  1(a).  With  a  slight  modification  of  the  detuning  from  the  Fabry-Perot  resonance,  this 
hysteresis  loop  can  be  transformed  into  a  simple  thresholding  curve  (fig.  1(b)). 
Fig. 1: Non-linear Fabry-Perot device response curves: a) bistable operation, b) simple thresholding operation

2.  Packet  switching  operation 

In  packet  switching  mode  self-routing  commutation  is  implemented  by  optical  decoding  of 
the  header  address  preceding  each  data  packet.  This  address  decoding  0)  consists  of  fanning- 
out  the  signal  in  64  channels,  comparing  bit  by  bit  the  address  of  the  packet  to  the  address  of 
each  output  channel,  and  cutting  all  channels  where  the  identification  of  one  or  more  bits 
fails. The following two-bit coding technique is used to encode the binary addresses: each bit
is followed by its complementary value, so bit one is encoded by a high-low sequence and bit
zero by a low-high sequence. With this code, address comparison becomes a simple
bitwise multiplication of the packet address by the inversely coded channel address,
introduced as the modulated transmission of the SLM pixel. Any mismatch between the two
addresses is turned into the transmission of a high-level address pulse by the modulator
element. This pulse then switches the corresponding bistable element from its initial
reflective state (point A) into its blocking state (point B on fig. 1/a). Consequently, the only
bistable  element  staying  in  the  reflective  state  will  be  that  of  the  destination  channel,  where 
address  matching  is  perfect.  After  the  data  packet  has  passed  through  this  channel,  all  bistable 
devices  are  reset  to  the  initial  high  reflection  state  to  receive  the  next  packet. 
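
The address comparison described above can be sketched in a few lines. This is only a logical model of the coding scheme, with hypothetical function names: high and low optical levels are represented by 1 and 0, and a 6-bit address is assumed because 64 channels need six binary digits (the address length is not stated in the text).

```python
def encode(bits):
    """Two-bit code: each bit is followed by its complement (1 -> 1,0 and 0 -> 0,1)."""
    coded = []
    for b in bits:
        coded += [b, 1 - b]
    return coded


def channel_mask(channel_bits):
    """Inversely coded channel address, used as the SLM pixel transmission sequence."""
    return encode([1 - b for b in channel_bits])


def mismatch_pulses(packet_bits, channel_bits):
    """Bitwise product of coded packet address and channel mask; any 1 is a high pulse."""
    return [p * m for p, m in zip(encode(packet_bits), channel_mask(channel_bits))]


def open_channels(packet_bits, n_address_bits=6):
    """Channels whose bistable element receives no high pulse and so stays reflective."""
    result = []
    for ch in range(2 ** n_address_bits):
        ch_bits = [(ch >> k) & 1 for k in reversed(range(n_address_bits))]
        if not any(mismatch_pulses(packet_bits, ch_bits)):
            result.append(ch)
    return result


print(open_channels([1, 0, 1, 1, 0, 1]))
```

Each mismatched address bit produces exactly one high pulse in the product, whichever way the two bits differ, which is why a single pulse is enough to switch off the bistable element of a non-matching channel.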

Figure  2  shows  output  signals  obtained  by  single  channel  operation  of  the  system. 
The first oscilloscope trace shows a packet addressed to the observed output: after the initial
synchronizing  pulse,  there  is  no  high  level  pulse  due  to  address  mismatch,  so  data  are 
transmitted  by  the  system.  The  last  three  traces  represent  packets  with  different  addresses 
which  do  not  match  the  address  of  this  output.  We  can  distinguish  the  high  level  address 
pulse  which  immediately  switches  off  the  bistable  device.  The  bistable  devices  used  in  the 
experiment  have  a  contrast  ratio  of  2  between  high  and  low  reflection  states,  so  theoretically 
blocked  packets  arrive  at  the  outputs  with  non-negligible  intensities.  The  total  input  power  to 
the  set-up  is  10  mW,  giving  a  peak  intensity  of  3.5  mW  on  the  bistable  element.  The  clock 
frequency  of  the  system  is  fixed  at  20  MHz  due  to  our  experimental  means  (device  response 
times  being  of  the  order  of  ns).  The  operation  of  the  system  at  address  bit  periods 
considerably  shorter  than  50  ns  would  require  higher  intensity  levels.  The  data  bit  rate  can  be 
increased  independently  of  device  limitations  because  the  system  is  entirely  transparent  to 
data.  System  operation  is  demonstrated  for  packet  lengths  up  to  8  kbits  with  a  typical  address 
recognition rate exceeding 99 %. The system stability is limited by parasitic switching events
provoked by amplitude fluctuations of the input laser beam.

We  have  also  demonstrated  simultaneous  operation  of  three  neighbouring  channels  of 
the set-up. The stability in this experiment was quite poor: correct operation could not be
obtained  for  data  string  lengths  exceeding  400  bits.  As  we  show  in  section  4,  the  number  of 
simultaneously  operating  channels  and  the  error  rate  of  address  decoding  is  mainly  limited  by 
the  non-uniformity  of  active  and  passive  devices  and  the  amplitude  fluctuations  arising  from 
the  laser  source. 


Fig. 2: Output signals of the self-routing operation


Fig. 3: Output obtained by direct addressing


3.  Switching  using  direct  optical  addressing 

In circuit switching mode holding beams are directly used to control bistable device
reflectivity, by setting the holding beam intensity at a lower or higher level than the switching
threshold, using the non-bistable operation mode of the nonlinear Fabry-Perot cavity (fig. 1/b).

Figure 3 shows one-channel operation of this scheme: when the control (or holding) beam is
off (point A of figure 1/b), the bistable device is reflective, so the signal is routed to the
output; when the control beam is on, the device is in the low-reflection state, so the signal
beam is mostly absorbed. In this operation mode the contrast ratio can be improved up to a
value of 4, while the total input intensity can be as low as 3 mW. This means that the switching
threshold is about 1 mW, versus 3.5 mW in the bistable mode. In this operation scheme the
control beam intensities (points A and B on Fig. 1/b) can be placed relatively far from the
switching threshold. This results in increased robustness against noise and non-uniformities
if the signal beam is weak compared to the control power. Our experimental
results confirm this expectation: 37 channels out of 64 could be operated simultaneously,
without any observable noise-induced instability.

This  operation  mode  can  be  transformed  into  active  switching,  by  setting  the  control 
beam intensity to point C of figure 1/b (i.e. very close to the switching threshold). If the
additional  signal  intensity  exceeds  the  threshold,  the  device  is  switched  by  each  data  bit 
change,  so  modulation  of  the  signal  is  transmitted  onto  the  control  beam.  As  the  control  beam 
is  in  general  more  intense  than  the  signal  beam,  data  amplification  can  be  obtained.  The  actual 
gain  depends  on  the  device  response  curve  and  the  signal  to  noise  ratio.  In  our  system  an 
amplification  factor  of  nearly  two  was  obtained,  but  with  other  bistable  devices  higher  gains 
have  also  been  demonstratedC"^).  This  operation  scheme  presents  the  advantages  of  data 
reshaping  and  amplification,  but  it  requires  data-rate  switching  of  the  bistable  element, 
limiting  the  data  rate  by  its  operation  speed.  Rise-  and  fall-times  of  the  order  of  10  ns  were 
observed  for  our  actual  devices  at  1  mW  threshold  intensities. 


4.  Tolerances  to  noises  and  non-uniformities 

For  the  simultaneous  and  stable  operation  of  the  channels  a  certain  number  of  working 
conditions  have  to  be  fulfilled  for  all  channels  and  during  the  whole  operation.  In  packet 
switching mode there are three criteria: the sum of the holding beam intensity and the
low-level signal beam intensity must stay below the higher switching threshold [1], the sum of
the holding beam and the high-level signal beam must exceed it [2], and the holding beam
alone must exceed the lower threshold [3]. In the worst case these conditions can be written
in the following form:

$$\alpha\, I_{\mathrm{contr}} + \alpha\sigma\, I_{\mathrm{sign}} < \frac{I_{\mathrm{th}}}{\beta} \qquad [1]$$

$$\frac{I_{\mathrm{contr}}}{\alpha} + \frac{C\, I_{\mathrm{sign}}}{\alpha\sigma} > \beta\, I_{\mathrm{th}} \qquad [2]$$

$$\frac{I_{\mathrm{contr}}}{\alpha} > \frac{\beta\, I_{\mathrm{th}}}{L} \qquad [3]$$
where $I_{\mathrm{contr}}$, $I_{\mathrm{sign}}$ and $I_{\mathrm{th}}$ are respectively the mean values of the holding beam, the low-level
signal beam and the higher threshold intensities. The contrast ratio $C$ and the hysteresis width
$L$ are defined such that $I_{\mathrm{sign}} \cdot C$ is the mean high-level signal intensity and $I_{\mathrm{th}}/L$ is the lower
threshold. The Greek letters ($\sigma$, $\alpha$, $\beta$) represent the magnitude of the spatial and temporal
variations of these intensities, with the common definition that all changes of a parameter $I$
with a variation $\gamma$ have to lie within the interval $[I/\gamma,\ I\gamma]$, where $\gamma \in \{\sigma, \alpha, \beta\}$. So the mean
intensities defined previously are simply the geometrical means between the maximal and
minimal values of the given fluctuating parameter set. Variations arising from the bistable
devices, the modulator array and the passive components are described by $\beta$, $\sigma$ and $\alpha$
respectively. Introducing the ratio $G$, taken relative to $I_{\mathrm{sign}}$, and $T = I_{\mathrm{th}}/I_{\mathrm{contr}}$, and finding the
optimal setting for $T$ and $G$, we obtain the following relationship between these parameters:
Similar expressions were derived for the two other switching schemes. Figure 4 plots the
tolerance regions for the three schemes. To meet all these tolerance limits, $\alpha\beta < 1.05$ and
$\sigma < 1.07$ are needed, which means that all spatial and temporal changes must be within ±7 %
for the SLM transmission, and typically ±2.5 % for the bistable device thresholds and the
passive components. We note that the so-called passive device variations include the
amplitude fluctuations arising from the laser as well as all non-uniformities and noise arising
from devices other than the two key elements. The performance of the devices used in this
first series of experiments falls completely outside the tolerance regions, so they do not
allow the simultaneous operation of all 64 channels. Non-uniformity compensation by
individual tuning of the holding beam intensities can be envisioned.
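
The three packet-switching criteria of this section can be checked numerically for a given operating point. The sketch below encodes one reading of the worst-case conditions and is an assumption on several counts: the holding beam is taken to vary by the passive factor alpha only, the signal by alpha and the modulator factor sigma, and the thresholds by the bistable factor beta. The function name and all sample intensities are hypothetical.

```python
def packet_mode_ok(i_contr, i_sign, i_th, contrast, hyst_width, sigma, alpha, beta):
    """True if the three worst-case packet-switching criteria hold simultaneously."""
    # [1] largest holding beam + largest low-level signal stays below the smallest
    #     upper threshold
    c1 = alpha * i_contr + alpha * sigma * i_sign < i_th / beta
    # [2] smallest holding beam + smallest high-level signal exceeds the largest
    #     upper threshold
    c2 = i_contr / alpha + contrast * i_sign / (alpha * sigma) > beta * i_th
    # [3] smallest holding beam alone exceeds the largest lower threshold
    c3 = i_contr / alpha > beta * i_th / hyst_width
    return c1 and c2 and c3


# Hypothetical operating point, intensities normalised to the upper threshold.
point = dict(i_contr=0.8, i_sign=0.1, i_th=1.0, contrast=4.0, hyst_width=1.5)

print(packet_mode_ok(**point, sigma=1.07, alpha=1.025, beta=1.025))
print(packet_mode_ok(**point, sigma=1.20, alpha=1.200, beta=1.200))
```

For this particular operating point the first call, with variations of the size quoted in the text, satisfies all three criteria, while the second, with 20 % variations, does not. Other operating points give different margins.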


Fig. 4: Tolerance regions for the three control modes, plotted as α·β versus σ (shaded areas meet the requirements of parallel operation)


5.  Conclusion 

In conclusion, first experimental results on the operation of the 1 to 64 free-space
optoelectronic switch were presented, both in packet switching and in circuit switching mode.
Uniformity  and  noise  effects  are  found  to  be  the  major  limitations  of  the  system  operation, 
hence  an  improvement  of  device  characteristics  and  non-uniformity  compensation  are  needed. 
Another  challenging  study  will  be  the  miniaturisation  of  the  system. 

This  work  is  supported  by  the  French  Ministry  of  Research  under  the  project 
Parallel Processing Architectures with Dynamic Optical
Interconnections using Spatial Light Modulators

Abstract. We discuss the advantages of dynamic optical interconnections in the context of
parallel  processing.  After  describing  algorithms  where  reconfiguration  is  useful,  we  describe  an 
optical  system  using  a  nematic  liquid  crystal  spatial  light  modulator  acting  in  a  phase-only 
modulation  mode.  Binary  phase  structures  displayed  on  the  device  provide  programmable 
interconnection  and  fan-out  topologies. 


1.  Introduction 

The increasingly important role of optical interconnections in communication and high speed
processing systems is due to the many advantages of optical connections over their electronic
(metallic)  counterparts.  These  include  the  high  degree  of  three-dimensional  parallel 
connectivity,  the  ability  of  beams  to  cross  in  free-space  without  crosstalk,  and  the  high 
bandwidth  of  optical  communication  channels. 

A  lens  system  can  perform  a  simple  1:1  interconnection  between  planes  of  processing 
elements.  More  complex  interconnection  topologies  can  be  performed  by  holographic 
elements,  microlens  arrays,  mirrors  and  prisms.  However,  unless  some  mechanical  system  is 
employed  the  interconnection  topology  is  fixed.  In  this  paper  we  discuss  some  of  the  cost  and 
performance benefits which may be achieved if the interconnection system is reconfigurable,
i.e. when the interconnection topology can be dynamically programmed under computer
control. Various devices have been utilised for such systems [1][2][3]. Here we describe a
reconfigurable  optical  interconnection  system  based  on  a  nematic  liquid  crystal  spatial  light 
modulator  (LCSLM)  which  operates  in  a  phase-only  modulation  mode.  This  device  has  been 
chosen  for  its  high  resolution,  low  cost,  and  ease  of  availability.  By  displaying  binary-level 
phase  structures  on  the  LCSLM  we  can  implement  a  variety  of  interconnection  and  fan-out 
topologies  which  can  be  dynamically  reprogrammed. 

2.  Role  of  Reconfigurable  Optics 

Reconfigurable  interconnections  have  several  advantages  over  fixed  interconnection  stages.  In 
particular  they  allow  the  maximisation  of  the  performance  gains  of  non-local  optical 
interconnections  over  a  range  of  algorithms  inherent  in  practical  image  processing  tasks.  They 
may  also  be  used  to  provide  acceleration  of  individual  algorithms  by  changing  the  processor 
interconnection  harness  during  run-time  of  a  program. 

We  have  investigated  how  reconfigurable  interconnects  may  be  employed  over  a  small 
range  of  tasks  including  image  correlation,  noise  removal  and  enhancement,  FFTs  and  sorting. 
In this study we have used such interconnects as we have experimentally configured, including
nearest neighbour, next-nearest neighbour, image dilation and various shuffle connections. In
particular we have built a set of optimised tasks which may be expected to run on our own
demonstrator hardware containing a reconfigurable plane. This enables us to generalise the
functionality of our optical processing modules while maintaining the performance advantages
of the optics, thus increasing the overall performance of the system. Numerical results
obtained by simulation of our reconfigurable architecture on a distributed array processor
(DAP) allow comparison with non-reconfigurable computing schemes.

For example, Figure 1 shows schematically the data flow connections for a standard
radix-2 Cooley-Tukey FFT algorithm. Consider only the interconnection stages labelled 1, 2,
and 3, which may be connecting arrays of electronic, optical or hybrid processing elements.
Each stage requires a fan-out of two: one path undeviated and the other deflected by a
distance depending on the depth of the stage. However, in stage 1 the top four data paths
require a different connection geometry to the bottom four paths. Nevertheless it is possible
to implement these connections in a space-invariant optical system by performing a fan-out of
three and providing a masking function (either electronic or physical) at the following
processing stage. So for N=8 inputs the three fixed interconnection stages could be replaced
by a single reconfigurable stage. The hardware saving is of the order of log2 N.
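The fan-out-of-three argument can be checked with a small sketch. It is an illustrative model rather than the authors' simulation: the function names and the convention that stage s deflects by N/2^s positions are assumptions made here. Each output receives the beams from positions i-d, i and i+d, and a mask keeps only the undeviated beam and the butterfly partner.

```python
def butterfly_sources(n, stage):
    """Inputs that feed each output of one radix-2 stage (stage = 1 .. log2 n)."""
    d = n >> stage  # deflection distance: n/2, n/4, ..., 1 (assumed ordering)
    return {i: {i, i ^ d} for i in range(n)}


def fanout3_with_mask(n, stage):
    """Same stage built from a space-invariant 1x3 fan-out followed by a mask."""
    d = n >> stage
    received = {}
    for i in range(n):
        # The 1x3 fan-out delivers the beams from i-d, i and i+d to output i.
        arrivals = [j for j in (i - d, i, i + d) if 0 <= j < n]
        # Mask: the partner lies at i-d when bit d of i is set, otherwise at i+d.
        partner = i - d if (i & d) else i + d
        received[i] = {j for j in arrivals if j in (i, partner)}
    return received


if __name__ == "__main__":
    for stage in (1, 2, 3):
        same = fanout3_with_mask(8, stage) == butterfly_sources(8, stage)
        print("stage", stage, "distance", 8 >> stage, "matches butterfly:", same)
```

Only the deflection distance and the mask change from stage to stage, which is why a single reprogrammable fan-out element can stand in for the log2 N fixed stages.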


Figure 1. Radix-2 Cooley-Tukey FFT algorithm (N=8 inputs). A single
reconfigurable stage could replace the fixed interconnection stages.


3.  Demonstration 

The liquid crystal (LC) panel used here is one of three panels obtained from a television
projection system manufactured by Seiko-Epson (VPJ2000). This panel has a larger number of
pixels and a smaller pixel size than those used in any previously reported experiments. The
panel is a twisted nematic device designed for intensity modulation when placed between crossed
polarisers. The plastic polarisers have been replaced with high quality polarisers in rotation
stages. The panel has physical dimensions of 26.9 x 20.2 mm² and contains 480 x 440 pixels.
The pixel active area is 31 x 31 µm² with a centre-to-centre spacing of 56 µm and 46 µm in
the horizontal and vertical directions respectively. The thickness of the LC layer is 4.5
± 0.5 µm and its birefringence is 0.091. The panel uses active matrix addressing and has a
polycrystalline Si thin film transistor (TFT) circuit at each pixel. We have interfaced the
projector to an IBM compatible computer for displaying arbitrary patterns on the panel.

Due  to  addressing  lines  between  pixels,  the  effective  clear  aperture  of  the  panel  is  37%. 
With  Fresnel  reflection  losses  at  the  air/glass/liquid  crystal  interfaces  and  absorption  losses  in 
the  LC  layer  we  have  measured  an  absolute  transmission  of  28%.  Another  source  of  loss  is 
diffraction  into  higher  orders  due  to  the  diffraction  grating  behaviour  of  the  pixel  structure 
when  illuminated  by  coherent  light.  After  spatial  filtering,  the  proportion  of  incident  power 
left  in  the  zero  order  was  approximately  10%.  To  overcome  these  losses  we  have  investigated 
the use of a microlens array (MLA) to focus the incident light through the pixel windows. An
eight-level  diffractive  MLA  attached  to  one  of  the  panels  increases  the  transmission  of  the 
panel  from  28%  to  51%.  Even  better  performance  is  expected  in  future  designs.  The 
interconnection  results  presented  here  were  obtained  using  an  unmodified  panel. 

Several  techniques  have  been  described  to  obtain  phase-only  modulation  from  twisted 
nematic  (TN)  panels.  These  usually  involve  accurate  setting  of  the  bias  voltage  [4]  or 
operating the panel in a double pass reflection configuration [5]. Unfortunately, these
techniques are not suitable in our case and an alternative technique is necessary. The simple
view of the operation of TN panels, where the polarisation of the incident light "follows" the
twist of the LC molecules, is only true in the case of thick LC layers. In general the operation
is more complex and a full understanding requires detailed analysis of the panels by Jones
calculus. Such analysis allows the derivation of an expression for the phase delay of the
exiting light which depends on the angles of the polariser and analyser as well as the voltage
applied across the panel. It is possible to find particular orientations of the polariser and
analyser for which the panel acts as a phase modulator while introducing negligible intensity
modulation [6]. It is in this configuration that we operate our panel.

One of the simplest interconnection patterns to implement is the fan-out of a single
beam to two identical diffracted orders (1x2). This is achieved by displaying a binary stripe
grating where the phase depth is alternately 0 and π radians. The interconnection distance can
be varied by altering the grating period. This phase structure has a theoretical diffraction
efficiency of 8/π² (81%) into the +1 and -1 orders. Table 1 summarises the theoretical and
experimental performance of the interconnection patterns described in this paper. If the
grating phase depth is less than π radians then some power is left in the undiffracted zero
order. By correctly choosing the phase depth the power in the +1, 0, and -1 orders can be
equalised, which provides a fan-out to three (1x3) interconnect. This is the interconnect
pattern required for an optical implementation of the FFT algorithm. The necessary phase
depth was calculated to be 0.64π and can be obtained by adjusting the voltage signal applied
to the pixel.
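The quoted phase depth and efficiencies follow from the standard scalar-diffraction result for a 50% duty-cycle binary phase grating, which is assumed here: the zero order carries cos²(φ/2) of the power and each first order carries (2/π)² sin²(φ/2). A minimal check, with illustrative function names:

```python
import math


def binary_grating_orders(phase_depth):
    """Power fractions in the 0 and in each of the +/-1 orders (50% duty cycle assumed)."""
    zero = math.cos(phase_depth / 2) ** 2
    first = (2 / math.pi) ** 2 * math.sin(phase_depth / 2) ** 2
    return zero, first


# 1x2 fan-out: phase depth pi, no zero order, 8/pi^2 of the power in the +/-1 orders.
zero_1x2, first_1x2 = binary_grating_orders(math.pi)
efficiency_1x2 = 2 * first_1x2

# 1x3 fan-out: equal power needs cos^2(phi/2) = (2/pi)^2 sin^2(phi/2), i.e. tan(phi/2) = pi/2.
phi_1x3 = 2 * math.atan(math.pi / 2)
zero_1x3, first_1x3 = binary_grating_orders(phi_1x3)
efficiency_1x3 = zero_1x3 + 2 * first_1x3

if __name__ == "__main__":
    print("1x2 efficiency:", round(efficiency_1x2, 3))
    print("1x3 phase depth / pi:", round(phi_1x3 / math.pi, 3))
    print("1x3 efficiency:", round(efficiency_1x3, 3))
```

Under these assumptions the sketch reproduces the 81% and 86% theoretical entries of Table 1 and a phase depth close to 0.64π.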

A  commonly  used  interconnection  in  image  processing  functions  is  the  neighbourhood 
interconnection  which  connects  a  pixel  to  its  neighbours  either  by  (i)  fanning  out  to  nearest 
neighbours  (NN)  in  the  up,  down,  left  and  right  directions,  or  (ii)  by  fanning  out  to  the  next- 
nearest-neighbours  (NNN)  in  the  diagonal  directions.  The  phase  structure  required  for  the 
NN  interconnection  is  obtained  by  crossing  the  phase  structure  for  the  1x2  fan-out  described 
above. This produces a checkerboard phase pattern which, when rotated by 45° and displayed
with an adjusted grating period, provides the NNN interconnection. These structures have a
theoretical diffraction efficiency of (8/π²)². Measured efficiencies agree well with
theoretical predictions.
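The (8/π²)² figure, about 65.7%, can be confirmed numerically. The sketch below is an idealised model and not the authors' design code: it samples one period of a crossed 0/π grating on an assumed 64 x 64 grid and sums the power in the four (±1, ±1) orders. The finite sampling makes the result slightly higher than the continuous value.

```python
import cmath
import math


def crossed_grating_order(m, n, samples=64):
    """Power fraction in order (m, n) of a crossed 0/pi binary grating (checkerboard)."""
    total = 0j
    for iy in range(samples):
        gy = 1 if iy < samples // 2 else -1
        for ix in range(samples):
            gx = 1 if ix < samples // 2 else -1
            # gx * gy = +1 is phase 0 and -1 is phase pi: a checkerboard.
            total += gx * gy * cmath.exp(-2j * math.pi * (m * ix + n * iy) / samples)
    return abs(total / samples ** 2) ** 2


spots = [crossed_grating_order(m, n) for m in (-1, 1) for n in (-1, 1)]
efficiency = sum(spots)
theory = (8 / math.pi ** 2) ** 2

if __name__ == "__main__":
    print("power per spot:", [round(s, 4) for s in spots])
    print("total:", round(efficiency, 4), "theory:", round(theory, 4))
```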

Dammann gratings are commonly used to generate arrays of equally intense spots.
However, higher diffraction efficiencies can be achieved by nonseparable phase structures
where the phase transition points are independently optimised in the horizontal and vertical
dimensions. We have used trapezoidal stripe-geometry binary phase structures [7] to
generate a variety of spot arrays including 4x4, 4x8, 2x4, and 8x8. These phase structures
have theoretical diffraction efficiencies of typically >70% and the measured efficiencies agree
well with predictions (Table 1). Figure 2 shows an example array and the phase structure
which generated it.

Table 1. Theoretical and experimental performance of various
interconnection patterns (NN = nearest neighbour, NNN = next
nearest neighbour). Uniformity error is expressed as the ratio of
standard deviation to average power.

Interconnect    Theoretical     Measured        Measured
pattern         diffraction     diffraction     uniformity
                efficiency      efficiency      error

1x2             81%             ≈80%            6%
1x3             86%             ≈81%            7.4%
1x4 NN          65.7%           ≈60%            9.5%
1x4 NNN         65.7%           ≈60%            7.2%
4x4             77%             ≈69%            10.7%
8x8             72%             ≈72%            32.9%

Figure 2. (a) Binary phase structure for non-separable trapezoidal stripe-geometry design to generate a
4x4 spot array, and (b) the spot array at the Fourier plane generated by this structure (λ = 632.8 nm).

4.  Conclusions 

We  have  described  the  advantages  of  a  reconfigurable  optical  interconnection  system  in  the 
context  of  parallel  optical  computing.  Using  a  commercially  available  nematic  liquid  crystal 
television  projection  panel,  operating  in  a  phase-only  modulation  mode,  we  demonstrated  a 
variety  of  dynamic  interconnection  topologies  with  a  reconfiguration  time  of  20-30  ms.  These 
include  simple  fan-out  in  one  dimension,  nearest-neighbour  and  next-nearest-neighbour 
interconnections,  non-local  interconnections  and  array  generation.  Measured  diffraction 
efficiencies are close to theoretical predictions. The relatively large uniformity errors on some
of the arrays are due to the fact that we are displaying pixellated versions of the original phase
structure designs. Improved performance would be achieved in future by using higher
resolution panels. The next generation of LC panels will have pixel sizes of the order of
10 µm and an order of magnitude increase in the number of pixels. Although the panels used
here are twisted devices designed for intensity modulation, we have described a technique of
obtaining phase-only modulation. However, the phase structures are limited to binary levels
of 0 and π radians. If untwisted (parallel aligned) devices were more readily available then
multilevel  phase  modulation  would  be  possible  which  would  allow  the  implementation  of 
kinoform  phase  structures  which  have  much  higher  efficiencies  and  would  not  require  the 
polariser  and  analyser  components. 
Abstract

Two approaches to optical interconnect using pixellated, electronically addressed spatial
light modulators as reconfigurable optical elements are presented: one analytical, described
using a low resolution device; the other using computer generated holograms and a
state-of-the-art planarised silicon backplane device.


1  Introduction 

Reconfigurable optical interconnection has been identified as a crucial function in optical
computing. One of the more attractive techniques involves the use of an electronically addressed
spatial light modulator (EASLM) in a Fourier transform optical system to give holographic routing.
For example, the optical information to be routed can be input to a 4-f coherent optical processor,
with the EASLM placed at the Fourier plane and the information destination at the output
plane. By the correct selection of the pattern on the EASLM, the information can be routed to
the required regions in the output plane. The patterns on the EASLM are generally gratings,
Dammann gratings or computer generated holograms (CGH). A very attractive generic type of
EASLM is the Ferroelectric Liquid Crystal over Very Large Scale Integration (FLC/VLSI) SLM
[1]. This device has an FLC cell fabricated on top of a custom designed VLSI silicon backplane.
The backplane contains an array of pixel memory elements, pixel mirrors and addressing circuitry.
The controllable pixels in the SLM can modulate the relative phase by exactly 0 or π due to the
switchable uniaxial nature of the FLC structure, as well as perform amplitude modulation. In
EASLMs there is a trend towards more pixel complexity and more intelligent modulator arrays,
which is the primary advantage of the VLSI backplane based devices. Techniques such as data
compression can reduce the reconfiguration time, because the main information 'bottleneck' of
the EASLM is the transfer of electronic information to the modulator array from 'off-chip' sources.

Generally, pixellation is inherent in spatial light modulators and this has major consequences
for their use in Fourier transform optical systems. The pixel transmission function determines the
'sinc' envelope function in the Fourier plane. A 100% fill-factor in the pixel transmission function
results in no replication at the Fourier plane, only the zeroth order. Any fill-factor lower than
100% conventionally results in higher order replications. Through planarisation techniques, the
pixel fill-factor can be increased towards 100%, but separation is always required between the pixel
modulation elements to avoid inter-pixel cross talk.
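The effect of fill-factor on the replicas can be illustrated in one dimension. The sketch assumes uniformly transmitting rectangular pixels of width w on a pitch a, so that the replica at order n is weighted by sinc²(n w / a). The function name and the use of a linear width ratio (rather than an area fill-factor) are choices made here for illustration.

```python
import math


def replica_intensity(order, width_ratio):
    """Relative intensity of replica `order` for pixel width = width_ratio * pitch (1-D)."""
    u = order * width_ratio
    if u == 0:
        return 1.0
    return (math.sin(math.pi * u) / (math.pi * u)) ** 2


if __name__ == "__main__":
    for ratio in (1.0, 0.9, 0.5):
        row = [round(replica_intensity(n, ratio), 4) for n in range(4)]
        print("width/pitch =", ratio, "orders 0..3:", row)
```

With a width ratio of 1 every replica falls on a zero of the envelope, and with a ratio of 1/2 only the even orders do.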

2  Manipulation  of  Replications 

The  pixellation  can  be  exploited  to  allow  routing  through  the  manipulation  of  the  replicated 
orders.  In  a  coherent  optical  processor  a  conventional  pixellated  Fourier  plane  filter  will  give  rise 
to  replication  of  the  information  at  the  output.  This  is  due  to  the  output  being  equal  to  the 
input rotated by 180° convolved with the Fourier transform of the filter. Previous work [9] has
shown that the delta-function-like spikes in the power spectrum of the filter which give rise to the
replication may be attenuated or removed by selection of a specific pixel pitch:size ratio and the
introduction of a specific pixel position randomisation scheme which applies a displacement of the
pixel centre from its regular position. This generalises the expression for the power spectrum of
the Fourier plane filter, |F_2(x,y)|^2, with all pixels 'on' to:

$$|F_2(x,y)|^2 = |P(x,y)|^2 \, Q^2 \left\{ 1 - |p(x)|^2 |p'(y)|^2 + |p(x)|^2 |p'(y)|^2 \left[ \frac{\sin(\pi Q a x)\,\sin(\pi Q a y)}{Q \,\sin(\pi a x)\,\sin(\pi a y)} \right]^2 \right\}$$

where P(x,y) is the Fourier transform of the single pixel transmission function, p(x) and p'(y) are
the Fourier transforms of the pixel position probability distribution functions, a is the pitch of the
underlying regular array, and the array consists of Q x Q pixels. The spectral orders are placed at
(n/a, m/a), n, m = 0, ±1, ±2, ... With square pixels of side a/2,
P(x,y) = sin(πax/2) sin(πay/2) / (π² x y), which
has zeroes at the positions of all the even numbered spectral orders other than the zero order.
Further, choosing p(x) = cos(πax/2) and p'(y) = cos(πay/2), we find that the allowed displacements of
the pixel from the regular position are such that each pixel can take one of four allowed positions in
an a x a square (Figure 1) and, most importantly, the odd numbered spectral orders in both x and
y directions are eliminated by zeroes of p(x) and p'(y). If however we retain periodicity in either
the x-direction (p(x) = 1) and/or the y-direction (p'(y) = 1), the first orders on the respective
axes will remain. The power in the attenuated orders is redistributed into a diffuse background
~ 1/Q² times the zero-order intensity (~ 0.4/Q² times the first order intensity).

Figure 1: (a) Possible pixel positions within an a x a cell; (b) a resultant 16x16 array of pixels.

The suggestion here is that a means exists whereby information can be routed via certain of
the first order replicas, dependent on the distribution of the transmitting subpixel in each cell of
the array: with the discretely randomised array described all first orders would be 'off'; by using
arrays with periodicity with respect to the axes the first orders on the x-axis, or y-axis (or both,
if the array is regular) would be 'on' (Figure 2).

There are several points to note. First, the idea lends itself to low resolution SLMs: using the
16x16 silicon backplane SLM designed by Underwood [3], a 2x2 block of pixels could be used to
code each of the a by a cells, giving an 8x8 array of cells with one pixel 'on' in each cell. This would
then give a background illumination due to the 'off' state of the first order ~ 26 times less than
the 'on' state, a difference in states which should be simple to threshold. Further, the SLM would
be capable of routing 8x8 pixel arrays when used in the Fourier plane of a 4-f optical processor,
offering the benefits of parallelism and a modest degree of image fanout. These properties are
scalable for larger arrays [1], with the added benefit that the signal (intensity of the 'on' first
order replicas) to noise (the diffuse background) ratio is proportional to the square of the array
size Q. Also, all the 'on' first order replicas have the same intensity regardless of whether there
are zero, two or four of them (to a good approximation): this would not be the case if the routing
were achieved simply by writing stripe patterns on the SLM to approximate diffraction gratings,
as the total number of transmitting pixels would vary depending on the desired number of replicas.
Lastly, this process has been described in terms of amplitude modulating SLMs. Phase modulating
SLMs are equally applicable and offer the prospect of eliminating the zero order.
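The switching of first-order replicas by sub-pixel placement can be tried out in a small numerical model. This is a sketch under stated assumptions, not the analysis of [9]: each a x a cell is sampled on a 4 x 4 grid, the transmitting sub-pixel of side a/2 occupies 2 x 2 samples, and the names build_array and order_intensity are hypothetical. The array is regular when no randomisation is applied, and it keeps periodicity along y when only the x position is randomised.

```python
import cmath
import random

SUB = 2  # samples per sub-pixel side; one cell of pitch a spans 2 * SUB samples


def build_array(q, randomise_x, randomise_y, rng):
    """Coordinates of transmitting samples for a q x q array of cells."""
    size = 2 * SUB * q
    on = []
    for cy in range(q):
        for cx in range(q):
            # One sub-pixel per cell, shifted by a/2 at random where requested.
            ox = rng.choice((0, SUB)) if randomise_x else 0
            oy = rng.choice((0, SUB)) if randomise_y else 0
            for dy in range(SUB):
                for dx in range(SUB):
                    on.append((cx * 2 * SUB + ox + dx, cy * 2 * SUB + oy + dy))
    return on, size


def order_intensity(on, size, q, n, m):
    """Squared DFT magnitude at spectral order (n/a, m/a), i.e. DFT index (n*q, m*q)."""
    total = sum(cmath.exp(-2j * cmath.pi * q * (n * x + m * y) / size) for x, y in on)
    return abs(total) ** 2


if __name__ == "__main__":
    rng = random.Random(1)
    q = 8
    regular, size = build_array(q, False, False, rng)
    x_random, _ = build_array(q, True, False, rng)
    print("regular array, order (1,0):", round(order_intensity(regular, size, q, 1, 0)))
    print("x randomised, order (1,0):", round(order_intensity(x_random, size, q, 1, 0)))
    print("x randomised, order (0,1):", round(order_intensity(x_random, size, q, 0, 1)))
```

In this idealised model the first order along a randomised axis drops, on average, by a factor of order Q² relative to the regular array, while the first order along the axis that keeps its periodicity is unchanged.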


3  Computer  Generated  Holograms 

CGHs allow reconfigurable routing within the orders themselves. This is presented using a high
performance silicon backplane device, which has been planarised to give a large phase-flat fill-factor.
The backplane used is a 176 x 176 pixel DRAM backplane. The pixels have a 30 µm pitch and the
pixel mirror before planarisation is 14 µm x 22 µm, i.e. 30% fill-factor. The device frame rate
is 1 kHz, limited by the interfacing electronics. When CGHs were displayed on the device before
planarisation, poor results were obtained [4]. This was mainly due to the low pixel fill-factor, but
also the poor FLC alignment caused by the low metal quality. The mirrors are fabricated using a
standard VLSI aluminium metal layer, so the layer is deposited with electrical conductivity, rather
than optical quality, as the main criterion. The mirrors are not optically flat and hence scatter light
to give noise in the reconstruction. The rough surface profiles of the mirrors also prevent uniform
high quality alignment of the FLC over the array. The low fill-factor resulted in very prominent
replicated orders and a very high d.c. spot. Hence the diffraction efficiency, or the quantity of
light measured where it was intended, was very low.

Figure 2: Selective removal of aliasing to provide selective fanout/routing.

Figure 3: Diffraction from a simple grating: (a) unplanarised EASLM, (b) planarised EASLM.

Planarisation  techniques  have  been  used  to  overcome  the  problems  associated  with  the  VLSI 
fabricated  circuitry  and  mirrors  by  burying  them  beneath  a  polished  dielectric  film,  depositing 
high  quality  metal  mirrors  on  the  surface  and  providing  small  interconnection  vias  through  the 
dielectric  layer  between  the  original  and  new  layers  [5].  These  techniques  have  been  applied  to  the 
device  described  above  and  have  increased  the  phase-flat  fill-factor  to  81%  [6]. 

The resulting improvement to the performance of the device in a Fourier transform system was
examined. An FLC cell was fabricated on top of a planarised device and a non-planarised device
under identical conditions. A simple grating structure (with an 8 pixel pitch) was displayed on both
devices and the resulting Fourier transforms are shown in figures 3(a) and 3(b). The unplanarised
device has a very large d.c. term (central peak), very low diffraction peaks and very prominent
replications (not shown). The planarised case shows considerable improvement, with the d.c. term
much reduced and the diffracted peaks the most prominent.

The display of CGHs on the planarised FLC/VLSI EASLM can produce interconnection or
fanout. The CGHs have been designed by simulated annealing to give arrays of spots from a plane
wavefront [4]. In the 4-f coherent processor, the input information would be routed to the position
of the spots in the output plane. By changing the CGH pattern on the EASLM, the position
and number of the spots can be altered and hence route the input information as required by the
system. A CGH pattern, designed through simulated annealing to produce a 4 x 4 fanout, was
displayed on the EASLM. The resulting fanout obtained after Fourier transformation by a simple
lens (f = 140 mm) is shown in figure 4. The d.c. spot is very low, and the diffraction efficiency
(light in the 16 peaks/total light in the Fourier plane) is ≈ 65%. Replicated orders are still
evident as the pixel function has < 100% fill-factor, but they are very low as the efficiency shows.

Figure 4: A 4 x 4 fanout pattern produced by a CGH on the EASLM.
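As an illustration of hologram design by simulated annealing, the following one-dimensional sketch optimises a binary (0/π) phase grating for a four-spot fanout. It is not the design procedure of [4]: the 32-pixel period, the target orders (-3, -1, 1, 3), the cost function (negative efficiency plus a weighted spread of spot powers) and the geometric cooling schedule are all assumptions chosen to keep the example short.

```python
import cmath
import math
import random

TARGET_ORDERS = (-3, -1, 1, 3)


def order_powers(pixels, orders=TARGET_ORDERS):
    """Power fraction in each requested diffraction order (pixels are +1 or -1)."""
    n = len(pixels)
    powers = []
    for k in orders:
        c = sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(pixels)) / n
        powers.append(abs(c) ** 2)
    return powers


def cost(pixels, weight=2.0):
    """Lower is better: high total efficiency and equal spot powers."""
    p = order_powers(pixels)
    mean = sum(p) / len(p)
    spread = math.sqrt(sum((x - mean) ** 2 for x in p) / len(p))
    return -sum(p) + weight * spread


def anneal(n_pixels=32, steps=3000, t_start=0.05, t_end=1e-4, seed=0):
    rng = random.Random(seed)
    pixels = [rng.choice((1, -1)) for _ in range(n_pixels)]
    current = cost(pixels)
    best, best_cost = pixels[:], current
    for step in range(steps):
        t = t_start * (t_end / t_start) ** (step / (steps - 1))  # geometric cooling
        i = rng.randrange(n_pixels)
        pixels[i] = -pixels[i]  # trial move: flip one pixel between 0 and pi
        trial = cost(pixels)
        if trial <= current or rng.random() < math.exp((current - trial) / t):
            current = trial
            if current < best_cost:
                best, best_cost = pixels[:], current
        else:
            pixels[i] = -pixels[i]  # reject the move and restore the pixel
    return best, best_cost


if __name__ == "__main__":
    design, final_cost = anneal()
    powers = order_powers(design)
    print("spot powers:", [round(p, 3) for p in powers])
    print("diffraction efficiency:", round(sum(powers), 3))
```

A two-dimensional design for the device described here would apply the same accept/reject loop to a pixel array and evaluate the spot powers with a two-dimensional transform.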

The VLSI/SLM is a very attractive device for this application, due to the potential level of
addressing complexity available, which allows very short reconfiguration times. Future devices will
have smaller pixel pitches (more compact systems), a higher space-bandwidth product (a higher
number of pixels) and improved optical quality. There is also the potential for storage of sets
of CGH patterns 'on-chip', i.e. designed into the backplane for a specific system requirement,
which would reduce considerably the reconfiguration time, as only 'local' data transfer need be
performed.

4  Conclusions 

The two techniques are complementary, allowing the manipulation of the replicated orders
and of information inside the orders. The pixellation of the EASLM has been shown to be a very
important parameter in terms of both the performance and capabilities. Control of the pixellation
is crucial in future devices for high performance in optical computing systems. These conclusions
are being considered as the next generation of EASLM devices is being developed.
Abstract. An optical package for a reconfigurable free-space optical interconnection
network based on a novel (2,2,2) switching node for multiprocessor systems
is presented. A novel photo-electronic hybrid switching node which consists
of SEED and S-SEED devices is described. The optical package of one stage in a
reconfigurable interconnection network is demonstrated.


1.  Introduction 

Interest has recently grown in reconfigurable optical interconnects and optical
switching networks for multiprocessor and telecommunication systems.

In this paper, we describe an optical package of a reconfigurable free-space optical
crossover interconnection network for multiprocessor systems. The link interconnection
stages are implemented by monolithic integrated free-space digital
optics. A novel photo-electronic hybrid node of type (2,2,2) is used as the switching
node, in which the hardware comprises two types of monolithic self-electro-optic-effect
device, one electronically addressed (SEED) and one S-SEED. The optical package of
one interconnection stage in a reconfigurable interconnection network is demonstrated.
Some preliminary results for the functionalities of the switching network are reported.

2.  The  architecture  of  optical  interconnection  network  package 

A free-space multistage interconnection network consists of node stages and link
stages. In our experimental system, an opto-electronic hybrid (2,2,2) node with electronic
signal control, as shown in Fig. 1, is used as the switching node of the
interconnection network; its optical hardware consists of two electronically addressed
SEEDs and one S-SEED. The electronic addressing signals, which are sent
from the controlling processor of the network, serve as control signals for the data
paths: one SEED controls the straight paths, the other SEED controls the exchange paths.
The S-SEED device, operating as an OR logic gate, is used for cascading.

Fig. 1. The opto-electronic hybrid (2,2,2) switching node.

An optical crossover network based on photo-electronic hybrid (2,2,2) switching
nodes is presented as a reconfigurable optical interconnection network, as shown
in Fig. 2. A reconfigurable optical switching network with N ports must be a
non-blocking switching network, which comprises 2 log2 N - 1 stages, in which the
optical systems in each stage have the same hardware components except for different
periods of the prismatic mirror arrays. The optical package of each stage, as shown in
Fig. 3, consists of phase Fresnel lenslet arrays (F), a prismatic grating mirror array
(PG), the beam splitters (BS), the quarter waveplates, SEED devices, and S-SEED
devices. The optical system is based on the principle of focal plane imaging. The
phase Fresnel lenslet array, which splits one collimated laser beam with homogeneous
power distribution in some region into a light spot array on the focal plane, is also used
as an imaging lenslet array which images the light spot on each mesa of the SEED in each
channel onto each window of the S-SEED substrates, or vice versa. The optical system of
the switching network shown in Fig. 3 consists of three parts. The first block, which
includes BS1, F1, PG and M, is called the optical link stage. The two connection types
of the crossover network, the straight and the exchange, are implemented in the
first block. The second block, which consists of BS2, SEED1 and SEED2, is called the
control stage of routing. Whether the straight or the exchange channels are
routed from the input to the output port at a given time is determined by the electronic
signals applied to the pixels of SEED1 and SEED2. The third block, which includes BS3, F2, F3,
PM1, PM2 and the S-SEED, is called the cascading stage. The S-SEED device operates
as an optical set-reset latch that can be made to simulate an OR logic gate. The input
signals are amplified and routed to the next stage. In terms of function,
the second and third blocks form the optical hardware of the optical switching
nodes shown in Fig. 1.

Fig. 2. An optical crossover interconnection network based on photo-electronic
hybrid (2,2,2) switching nodes.

Fig. 3. The layout of the optical package of the free-space crossover interconnection
network.
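The behaviour of one node and the stage count can be summarised in a few lines. This is a hypothetical logic-level model, not a description of the device physics: the function names are invented here, and it assumes that each output S-SEED forms the OR of a straight beam gated by SEED1 and an exchange beam gated by SEED2.

```python
def node_outputs(in_a, in_b, straight_on, exchange_on):
    """Logic model of one (2,2,2) node: SEED1 gates straight paths, SEED2 gates exchange paths."""
    out_a = (in_a and straight_on) or (in_b and exchange_on)  # OR formed at the S-SEED
    out_b = (in_b and straight_on) or (in_a and exchange_on)
    return bool(out_a), bool(out_b)


def stages_for(n_ports):
    """Number of stages, 2*log2(N) - 1, for a network with N ports (N a power of two)."""
    if n_ports < 2 or n_ports & (n_ports - 1):
        raise ValueError("n_ports must be a power of two")
    return 2 * (n_ports.bit_length() - 1) - 1


if __name__ == "__main__":
    print("straight:", node_outputs(True, False, straight_on=True, exchange_on=False))
    print("exchange:", node_outputs(True, False, straight_on=False, exchange_on=True))
    print("stages for 8 ports:", stages_for(8))
```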

The optical network has two connection types: straight and exchange. The straight
connections are implemented by the route F1-BS1-M-BS2-SEED1-F2-PM1-BS3-S-SEED. The
exchange connections are implemented by the route F1-BS1-PG-BS2-SEED2-F2-BS3-PM2-S-SEED.
The 4-level phase Fresnel lenslet arrays F1 and F2 are imaging lenslet arrays. F1 images
the light spots from each mesa of the S-SEED device in the previous stage onto each mesa
of the SEED1 and SEED2 substrates. F2 images the pattern of the SEED1 and SEED2
substrates onto the windows of the S-SEED. The 4-level phase Fresnel lenslet array Fc
generates the light spot array: it splits a collimated laser beam of homogeneous power
distribution over some region into an array of light spots on its focal plane, which is
located at the window plane of the S-SEED. This light spot array is called the clock light
array and reads out the states of the S-SEED to the next stage. In the 8 x 8 optical
interconnection network, F1 and F2 are 8 x 8 phase Fresnel lenslet arrays and Fc is an
8 x 16 lenslet array. Route selection is controlled by electronic addressing signals
applied to the pixels of SEED1 and SEED2. The data signals are then transferred to the
S-SEED device, read out by the clock light array and amplified to the next stage. The
electronic addressing signals come from a personal computer, which performs the path
hunting.

Based on the architecture detailed above, we fabricated the optical package of one stage
of a reconfigurable 8 x 8 free-space optical switching network. The spacing between
adjacent channels is 0.4 mm, so all 8 x 8 channels are arranged in an area of
3.2 x 3.2 mm². The insertion loss of the SEED optical switch device is about 40%. The
switching time of the SEED is about 10 ns when the devices are operated in the normal-on
state. When the S-SEED device was replaced by a testing array whose response time is
about 2 ns, some basic operations of the optical switching node (2,2,2) and of the optical
interconnection network were demonstrated. Data packets containing 8 bits of data signals
from a personal computer were transported in parallel through the optical package of the
interconnection network from the input ports to the output ports.
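The switching behaviour described above can be summarized by a simple functional model. The sketch below is an illustration only and rests on assumptions not stated in this form in the paper: each SEED1 or SEED2 pixel is treated as an ideal on/off gate set by the addressing signal, and the S-SEED is treated as an ideal OR of the two gated inputs. The function and parameter names are hypothetical.

```python
def switching_node(straight_in: bool, exchange_in: bool,
                   pass_straight: bool, pass_exchange: bool) -> bool:
    """Idealized logic model of one output of a switching node.

    straight_in / exchange_in: data bits arriving over the straight and
    exchange routes (assumed labels).
    pass_straight / pass_exchange: addressing bits applied to the SEED1 and
    SEED2 pixels; True means the pixel lets the light through (assumption).
    The S-SEED is modelled as an OR gate combining both gated paths.
    """
    return (straight_in and pass_straight) or (exchange_in and pass_exchange)


# Select the straight route: the output follows the straight input only.
print(switching_node(True, False, pass_straight=True, pass_exchange=False))
# Select the exchange route: the straight input is blocked.
print(switching_node(True, False, pass_straight=False, pass_exchange=True))
```

Optical power levels, insertion loss and timing are deliberately left out; the model only captures which input reaches the output for a given addressing state.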

3.  Conclusion  and  discussion 

The optical system is based on the principle of focal-plane imaging, with a phase Fresnel
lenslet array used as the imaging lenslet array. It has the following characteristics: the
focal length of the phase Fresnel lenslet array can be made very short, and the design of
the optical system is very compact, so that all optical components can be fixed with
optical glue to form a package architecture. By comparison with an infinite-conjugate,
telecentric, 4f imaging system, this architecture avoids aberration accumulation and
alignment difficulty. The optical package therefore offers small physical size, greater
alignment tolerance, shorter alignment time, high stability and high reliability. The
optical power loss and the density and number of optical interconnection channels are
perhaps the major limitations of this optical package architecture. There are several ways
of decreasing the optical power loss in the system: decreasing the loss in the SEED,
increasing the high-state reflectivity of the SEED, and increasing the precision of the
photolithography and processing. An important issue in the current system is how to obtain
a collimated laser beam with homogeneous power distribution of the required size for the
clock beam. The best approach is to use vertical-cavity surface-emitting lasers for the
reading beams, because the surface-emitting microlaser not only provides high power but
also readily forms a collimated and homogeneous laser beam.
Abstract. We present a free-space network architecture supporting communication
rates greater than 1 Gbit/s per port. Using today’s technology, our network scales
to thousands of ports and is useful for massively parallel processing architectures.
A 64-port prototype was demonstrated.


1.  Introduction 

The interconnection network plays an important role in the overall performance of
parallel processing systems. Optics has been suggested as a potentially superior technology
to electronics for interconnecting thousands of processing elements (PEs) in a massively
parallel processing (MPP) system. In an MPP system, a PE may be built around a
state-of-the-art, off-the-shelf processor (Digital “Alpha”, MIPS R4400, etc.). A
high-performance network is needed to serve the interconnection needs of today’s and
near-future processors (“Intel 2000 CPU”) that are being, and will be, used for building
MPP systems. Such processors, with a performance of one to two thousand MIPS (million
instructions per second) on words of information that are 64 to 128 bits long, generate
a total throughput of 128 to 256 Gbit/s. Not all of this traffic needs to be handled by
the network, but even a fraction of it (e.g., 10 Gbit/s per PE) may cause difficulties
in large systems with thousands of PEs.
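The throughput figures above follow from multiplying the instruction rate by the word length. The short sketch below reproduces that arithmetic under the assumption (ours, for illustration) that one word is processed per instruction; the function name is hypothetical.

```python
def raw_throughput_gbit_s(mips: float, word_bits: int) -> float:
    """Raw data rate in Gbit/s, assuming one word is handled per instruction."""
    return mips * 1e6 * word_bits / 1e9


# Sweep the processor range quoted in the text.
for mips in (1000, 2000):
    for bits in (64, 128):
        rate = raw_throughput_gbit_s(mips, bits)
        print(f"{mips} MIPS x {bits} bits = {rate:.0f} Gbit/s")
```

With these assumptions, 2000 MIPS on 64-bit and 128-bit words gives 128 and 256 Gbit/s respectively, which matches the quoted range; 1000 MIPS on 64-bit words would give 64 Gbit/s.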

A network can be characterized by several parameters: the number of ports; the
bandwidth capacity per port; the network’s structure; the latency from one port to
another without arbitration (i.e., in a circuit-switching or reconfigurable mode of
operation); and the routing and switching delays (caused by arbitration and other
traffic-specific issues such as blocking, and also dependent on the network structure).

It  is  generally  accepted  that  for  large  networks  (thousands  of  ports),  optics  is  better  than 
electronics  in  providing  high  bandwidth  connections.  However,  it  is  also  commonly  ac¬ 
cepted  that  today,  optics  cannot  compete  with  electronics  for  general  purpose  processing 
because  of  higher  cost  and  the  immaturity  of  the  optics  technology. 
2.  A  reconfigurable  free-space  network

Our proposed optical network is based on several basic principles and observations on
the needs and possible tradeoffs of an optical MPP network:

1.  “Logic-Less”  optical  operation:  Optical  technology  is  used  for  point-to-point  high 
bandwidth  connectivity  and  not  to  do  dynamic  routing  or  arbitration. 

2.  Distributed  Layout  and  Control:  No  centralization  in  terms  of  one  common  element 
(e.g.  a  lens)  or  active  device  (e.g.,  one  array  of  VCSELs).  No  central  arbitration 
and  control  is  needed.  The  optical  network  scales  in  proportion  to  the  number 
of  PEs  (although  for  practical  reasons,  one  would  choose  to  use  fewer,  but  larger 
active  devices  such  as  VCSEL  arrays). 

3.  Reconfiguration is not needed very often, as many parallel applications exhibit
switching locality: i.e., a PE will only need to communicate with a small number
of other PEs for a period of time. Choosing between these PEs can be done by the
electronic switches, which act as “interconnection caches” to the initial
reconfiguration of the optical network.


3.  Folded  Clos  network 


Figure 1. A folded Clos. [Figure labels: n x m, r x r.]


Figure  2.  Embedding  a  binary  tree. 


Figure  1  presents  the  structure  of  our  network.  The  middle  stage  is  the  optical  network, 
made  of  “logic-less”  switches  (i.e.,  it  is  a  circuit-switching  operation)  using  VCSEL  ar¬ 
rays.  The  first  and  third  stages  (folded  into  one)  are  made  of  small  electronic  crossbar 
switches. 

Such a network can be used to map parallel applications that have switching locality.
We have investigated the mapping issue previously and concluded that many classical
communication patterns (such as trees, 2-D and 3-D meshes, hypercubes, pyramids,
meshes of trees, etc.), as well as non-symmetrical communication patterns, can be
embedded well in our structure [1, 2].
4.  Mapping  and  embedding

Figure  2  is  an  example  of  the  embedding  of  a  tree  into  our  network.  Note  that  the 
clustering  means  that  nodes  in  the  same  cluster  will  be  mapped  to  boards  that  share  a 
common  small  electronic  crossbar  switch.  The  connections  between  clusters  are  made 
by  reconfiguration  of  the  optical  middle  layer  of  the  Clos  network. 


5.  System  overview 


Figure 3 gives an overall view of our system with a distributed optical network.
PEs are grouped into boards, each having a fast electronic crossbar switch. Boards are
arranged into columns that are connected by multiple free-space optical buses. Each
board can have optical connections to a few other boards (8 to 16). These boards are
selected by reconfiguring the optical interconnections. This mode of operation is useful
in many parallel applications that exhibit switching locality and/or have “phases” of
computation with specific interconnection topologies. Examples of such applications
using the reconfigurable network can be found in [3, 4].


Figure  3.  Optical  network. 


Figure 4. 64-port prototype.


In  our  experimental  system,  we  used  VCSEL  arrays  that  are  arranged  as  8  x  8 
devices  with  individual  access  to  each  VCSEL.  Such  individual  access  is  not  needed  for 
our  architecture,  but  these  devices  were  available  off  the  shelf.  However,  by  having  access 
to  individual  VCSELs,  it  was  possible  to  use  only  four  8x8  devices,  each  subdivided 
into  16  separate  zones  each  having  2x2  devices.  Thus  each  PE  can  reconfigure  its 
subdivision  to  connect  to  one  of  3  other  boards.  Hence  the  system  simulated  4  boards 
of  16  PEs  each. 
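The partitioning of the prototype can be made concrete with a small bookkeeping sketch. The counts (four 8 x 8 arrays, 16 zones of 2 x 2 VCSELs, 3 reachable boards per PE) come from the text. The row-major zone numbering and the rule that the k-th VCSEL of a zone points at the k-th other board, leaving the fourth VCSEL unused, are assumptions made only for illustration.

```python
ARRAY_SIDE = 8   # each VCSEL device is an 8 x 8 array (from the text)
ZONE_SIDE = 2    # each PE owns a 2 x 2 zone (from the text)
BOARDS = 4       # one VCSEL device per simulated board (from the text)

ZONES_PER_SIDE = ARRAY_SIDE // ZONE_SIDE   # 4 zones along each side
PES_PER_BOARD = ZONES_PER_SIDE ** 2        # 16 PEs per board


def zone_vcsels(pe: int) -> list[tuple[int, int]]:
    """(row, col) of the four VCSELs in the zone of PE `pe` (0..15).

    Zones are numbered row-major across the array (assumed layout).
    """
    zone_row, zone_col = divmod(pe, ZONES_PER_SIDE)
    return [(zone_row * ZONE_SIDE + r, zone_col * ZONE_SIDE + c)
            for r in range(ZONE_SIDE) for c in range(ZONE_SIDE)]


def select_vcsel(board: int, pe: int, target_board: int) -> tuple[int, int]:
    """VCSEL a PE would switch on to reach `target_board`.

    Assumed rule: the k-th VCSEL of the zone serves the k-th other board;
    the fourth VCSEL of the zone is left unused.
    """
    if target_board == board or not 0 <= target_board < BOARDS:
        raise ValueError("target must be one of the 3 other boards")
    others = [b for b in range(BOARDS) if b != board]
    return zone_vcsels(pe)[others.index(target_board)]


print(BOARDS * PES_PER_BOARD)   # total number of ports
print(select_vcsel(board=0, pe=5, target_board=2))
```

Reconfiguration in this picture amounts to changing which VCSEL in a zone is driven; the data path itself is unchanged.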

Figure 4 is a picture of a 64-port prototype we have built based on the above
network architecture. A detailed description of this prototype and its characteristics
appears in another paper in these proceedings [5].


6.  Scalability  and  future  issues 

Future systems may have larger numbers of PEs. One critical element is the size of, and
the electrical connections to, the VCSEL arrays to be used. Our prototype is an example
of the foreseen maximum needed (i.e., 16 connections per VCSEL array). We anticipate
fewer electrical connections in future VCSEL arrays. Such devices can be made as an
X-Y matrix or other structure to control each sub-division of the array for each PE,
so that individual VCSELs can be selected to work and transmit data. By selecting a
certain VCSEL, the connection to the corresponding board can be made by the optical
layer of the Clos network. The number of high-bandwidth connections that need to
be made to the package of such an array will be limited to at most 16. In addition,
low-speed connections are needed for choosing the appropriate VCSEL in each
sub-division; these can be low speed because reconfiguration is not expected to be done
very often. One important issue to address is the power budget. Currently, we employ
variable-reflectivity mirror elements to maximize the power available. In addition, it
may be necessary to add amplification or regeneration of the optical data at certain
locations to achieve high-bandwidth, low-noise communication. The performance of such a
system was found to be close to that of an “ideal” crossbar (i.e., one that has the same
communication delays while it scales to large sizes) [6].


7.  Summary

We have presented the system structure of a prototype optical interconnection network
for MPP architectures. Our system benefits from the high connectivity and throughput
that optics can offer, while leaving the processing needed for routing to electronic
technology. Many parallel applications can use such a reconfigurable network, as they
naturally exhibit switching locality in their communication needs. We hope to extend
our system in the future to accommodate thousands of PEs with multi-Gbit/s channels.
Abstract. Reconfigurable optical interconnections using a double phase conjugate mirror in
photorefractive Bi12TiO20 fibre are demonstrated. A recyclable phase conjugate mirror with
conversion efficiency of up to 8% has been recorded in the fibre over a wide range of pump
incidence angles at λ = 632.8 nm. In practice this means that no adjustment is needed to
create an interconnection using mutually incoherent beams, other than introducing the pump
beams into the fibre. The response time was measured to be 5 seconds for a pump beam
intensity of 1 mW/mm².


1.  Introduction 

Photorefractive crystals (PRC) have proved significant in parallel optical information
processing owing to their real-time, high-density, recyclable volume holographic
recording and phase conjugation. One of the most interesting and promising effects recently
discovered in PRC is double phase conjugation of mutually incoherent pump beams. When a
PRC is illuminated on opposite faces by two pumps, two independent counterpropagating beams
build up, originating from beam fanning. They give rise to a common grating, which is
reinforced by positive feedback and finally results in a pair of phase conjugate beams
through a cross-readout process [1]. This double phase conjugate mirror (DPCM) can be used
for reconfigurable interconnections between two sets of fibres [2].

Experimental study of the DPCM has concentrated mainly on bulk crystals. Recently,
however, double phase conjugation in a fibre sample has been demonstrated in our
laboratory [3,4]. Compared with bulk crystals, fibres are much easier to produce owing to
the Laser Heated Pedestal Growth (LHPG) technique, which allows a single-crystal fibre of
better quality to be grown much faster. Moreover, the physical mechanism of beam coupling
in the fibre sample differs from that in the widely studied photorefractive bulk crystals.
In this presentation, the advantages of applying the Bi12TiO20 fibre to reconfigurable
optical interconnections are demonstrated.

2.  Experimental  configuration 

2.1.  Optical  scheme  for  interconnection 

As an example, interconnection between one fibre and a set of three fibres is shown in
Fig. 1. Light emitted from the fibre (Input 1) is focused by lens 1 onto one face of the
PRC. At the same time, the light emitted from the set of fibres (Inputs 2-4) is focused by
lens 2 onto the opposite face. After a response time τ, a common holographic grating is
recorded in the PRC, producing a pair of phase conjugate beams, which are automatically
coupled into the fibre from one side and into the set of fibres from the other side. The
phase conjugate beams can be extracted using beamsplitters BS. Therefore, the signal from
Input 1 reaches all the Receivers R2-R4 and the sum of the signals from Inputs 2-4 reaches
Receiver R1. In this example Input 1 is interconnected to all three Inputs 2-4. The
interconnections can be changed simply by introducing light into only two fibres (3 and 4,
for example). In that case, after the response time τ, a new common holographic grating
will be recorded in the PRC and new interconnections between Input 1 and Inputs 3 and 4
will be established.

The performance of the reconfigurable interconnection depends on the three most important
parameters of the photorefractive DPCM: the efficiency of phase conjugation, the
response time, and the spatial bandwidth. All these parameters have been measured in our
experiments.

2.2.  Photorefractive  fibres 

Experiments were carried out on fibre-like Bi12TiO20 (BTO) samples. This crystal
belongs to the same structural class (sillenite) as Bi12SiO20. However, BTO has a larger
electro-optic coefficient and lower optical activity than other sillenites, and is more
sensitive than the BaTiO3 and SBN photorefractive crystals. The fibres were cut from the
bulk crystal along the [110] crystallographic axis and were glued between two
electrodes. They are 8-18 mm long and have a rectangular cross-section approximately
1 mm in size. An alternating bipolar electric field of square-wave form, with a pulse
duration of 14 ms and a front duration of 2.5 ms, could be applied to the crystal so
that either the {110} or the {111} plane was perpendicular to the electric field vector.
The latter orientation is known to produce the highest energy exchange effect along the
applied field vector in BSO-type crystals [5]. The electrodes were made 1.2 mm shorter
than the sample and were covered completely with epoxy glue to prevent electric breakdown
through the air and along the crystal surfaces. Voltages as high as 5 kV (corresponding
to a bias electric field of 50 kV/cm) could therefore be applied to the fibre.

It is known [6] that hologram recording in BSO-type photorefractive crystals under an ac
electric field results in the formation of a refractive index grating that is phase
shifted by 90° with respect to the illuminating interference pattern. This condition is
essential for strong energy exchange between the pump beam and the scattered light, which
results in the appearance of strong fanning light, the origin of double phase conjugation.

The four longer faces of the fibre were optically polished, but the end faces were
lightly ground to scatter the transmitted beam into an angle of approximately 4
degrees. This was done deliberately to introduce pump beams with a complex (speckle-like)
wavefront into the fibre, which allowed us to avoid conical degeneracies in the Bragg
matching angle on a given volume grating. Two independent linearly polarized helium-neon
lasers (λ = 632.8 nm) were used to pump the BTO fibre. The input polarization angles were
chosen experimentally to give as high a phase conjugate reflectivity as possible.

3.  Experimental  results 

3.1.  Conversion  efficiency

It is very easy to record the DPCM in a photorefractive fibre. It is enough to introduce
two mutually incoherent pump beams (derived from independent lasers) into the fibre
through its opposite end faces. We have found that both types of fibre (with the external
field parallel to the <110> and <111> axes) possess high enough conversion efficiency to
produce phase conjugation of complex images [4]. The spatial resolution of the phase
conjugated images, as estimated from the experimental photographs, is approximately
50 lp/mm. The experiments on phase conjugation of images were made with the <110>
oriented fibre.

On the other hand, we have found that the DPCM recorded in a BTO fibre of <111>
orientation has a much wider band of pump incidence angles than in our previous work
[4,7]. Experimental results are shown in Fig. 2, where the steady-state conversion
efficiency η is plotted as a contour map against the incidence angles of pump #1 (α1)
and pump #2 (α2). Here we define η as S1/P2, where S1 is the power of the beam that is
the phase conjugate replica of pump #1 and P2 is the input power of pump #2. We
should emphasize that our definition of η does not take into account the high losses of
light power (20% for each reflection) caused by the high refractive index of the BTO
crystal (n = 2.58 at λ = 632.8 nm). Antireflection layers evaporated on the fibre end
faces can therefore significantly improve the efficiency of the DPCM. As one can see
from Fig. 2, a DPCM with efficiency of more than 2% can be obtained while changing the
pump #1 incidence from -15° to 18° and the pump #2 incidence from -12° to 12°. The
numbers on the contour lines of Fig. 2 show the efficiency in per cent. The maximal
conversion efficiency is 8%. In practice this means that no adjustment is needed to
obtain an effective interaction of mutually incoherent beams, other than introducing the
pump beams into the BTO fibre.

Fig. 2. The DPCM conversion efficiency as a function of pump incidence angles.
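The reflection loss of about 20% per surface quoted above is consistent with the Fresnel reflectance of an uncoated surface at normal incidence. The sketch below evaluates the standard formula R = ((n - 1)/(n + 1))² for the refractive index given in the text; normal incidence and an air ambient are assumptions made for this check, and the function name is ours.

```python
def fresnel_reflectance_normal(n_crystal: float, n_ambient: float = 1.0) -> float:
    """Power reflectance of a single uncoated surface at normal incidence."""
    r = (n_crystal - n_ambient) / (n_crystal + n_ambient)
    return r * r


# Refractive index of BTO at 632.8 nm as quoted in the text.
R = fresnel_reflectance_normal(2.58)
print(f"Reflectance per surface: {R * 100:.1f} %")
```

For n = 2.58 this evaluates to roughly 19.5%, in line with the figure quoted for each reflection.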

3.2.  Response  time 


Although the steady-state value of the DPCM conversion efficiency is rather uniform, its
temporal behaviour varies significantly with the incidence angles α1 and α2 of the two
pumps. Usually, the response time is inversely proportional to the intensity of the pump
beam, but it was found to depend also on the power ratio of the pump beams [7],
approaching a minimal value for pump beams of equal intensity. It takes 5 s to reach the
maximum conversion efficiency at a pump beam intensity of 1 mW/mm².
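As a rough guide, the inverse proportionality between response time and pump intensity can be used to scale the single measured point (5 s at 1 mW/mm²). The sketch below does only that. It is an extrapolation under stated assumptions: it ignores the dependence on pump power ratio and incidence angle mentioned above, and the range of intensities over which the scaling holds is not given in the text.

```python
TAU_REF_S = 5.0          # measured response time quoted in the text, seconds
I_REF_MW_PER_MM2 = 1.0   # pump intensity at which it was measured


def response_time_s(intensity_mw_per_mm2: float) -> float:
    """Response time scaled as 1/intensity from the single measured point."""
    if intensity_mw_per_mm2 <= 0:
        raise ValueError("intensity must be positive")
    return TAU_REF_S * I_REF_MW_PER_MM2 / intensity_mw_per_mm2


# Illustrative intensities only; just the 1 mW/mm^2 point is measured.
for intensity in (0.5, 1.0, 10.0):
    print(f"{intensity:5.1f} mW/mm^2 -> {response_time_s(intensity):.2f} s")
```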

4.  Conclusion 

The application of Bi12TiO20 photorefractive fibre to reconfigurable optical
interconnections has been proposed. BTO fibres can be assembled in a variety of
configurations, isolating each stack of interconnections from the stacks in adjacent
fibres. Moreover, they can easily be integrated with semiconductor lasers. This makes it
possible to realize a large number of interconnected fibres, with independent
reconfiguration of different groups of them.
Abstract

The optical and optomechanical design of a representative portion of a lenslet array based
free-space photonic backplane is described. Issues relating to the optical design,
optomechanical layout and alignment tolerances will be discussed.

Introduction 

The  following  paper  will  consider  the  optical  and  optomechanical  design  of  a  unidirectional 
lenslet  array  based  PCB-to-PCB  interconnection  link.  The  system  uses  FET-SEED 
modulators  and  receivers  to  transmit  data  between  two  circuit  boards  and  multi-level 
diffractive  lenslets  to  interconnect  the  device  planes.  Differential  logic  is  used  to  minimize  the 
sensitivity  of  the  system  to  optical  power  supply  variations,  thus  each  data  channel  requires 
two  optical  signal  beams  to  encode  the  information.  The  design  of  the  FET-SEED  arrays  and 
the performance of the backplane demonstrator are described elsewhere in these
proceedings [1]. The aim of this work is to develop the technology required to implement a
reconfigurable terabit capacity photonic backplane [2].

Optical  Interconnect  Design 

Integrating  a  free-space  photonic  backplane  into  a  commercial  electronics  environment  will  be 
a  challenging  task.  Current  electronic  backplane  standards  typically  demand  a  board-to-board 
separation  of  the  order  of  one  inch.  Any  competitive  photonic  backplane  must  therefore  be 
compact  enough  to  fit  within  this  space.  The  optics  must  be  designed  such  that  it  is  possible 
to  remove  and  insert  circuit  boards  while  maintaining  precise  registration  between  the  optical 
signal  beams  and  the  transmitter  and  receiver  arrays.  In  addition,  alignment  should  be 
unaffected  by  temperature  variations  and  external  vibrations.  This  places  exacting  demands  on 
the  optical  and  optomechanical  design  of  the  backplane. 

Figure 1 shows the optical layout of the photonic backplane demonstration circuit. Light
from a Ti-Sapphire laser operating at 850 nm enters the circuit via a single-mode
polarization-preserving fiber. The light is collimated and passes through a non-separable
binary-phase grating which generates an array of optical beams at the input to the optical
interconnection assembly. The beams are collimated by lenslet array L1, reflected by the
polarizing beam splitter (PBS) and focused by lenslet array L2 onto the array of FET-SEED
quantum well modulators (figure 2). The intensity-modulated signal beams pass back through
the PBS (the polarization of the light having been rotated by a quarter waveplate) and are
finally focused by lenslet array L3 onto the array of receivers. The layout of a single
pixel is shown in figure 3. Due to the close proximity of the window pairs, each lenslet
must simultaneously carry two signal beams. It was therefore decided to use a telecentric
optical relay to ensure a high optical throughput, simplify the optical design and
maximize the tolerance of the interconnect to misalignment and fabrication errors.


Figure 1 - Optical layout of PCB-to-PCB interconnection link.

Figure 2 - Optical interconnect layout.

Figure 3 - Pixel window layout.

The optimum lenslet array geometry may be calculated by using a gaussian beam analysis
to find the dependence of the window size, d_w, on the channel density, C_d (channels
per cm²). The analysis presented here assumes the use of square-packed lenslet arrays
with dimensions of D_L = D_p. In addition, the restrictions w_L = 1/3 [D_p - (d_s + d_w)]
and 3w_0 = d_w have been imposed, where w_L is the gaussian (1/e²) beam radius at the
lenslet array plane and w_0 is the beam radius of the focused beam. These limits ensure
that minimum clipping of the beam occurs as it propagates through the interconnect [3].
Figure 4 shows the variation of d_w with C_d as a function of window spacing, d_s, for a
relay with a focal length of f = 6.5 mm (to obtain a PCB-to-PCB pitch close to current
backplane standards) and an operating wavelength of 850 nm. Higher channel densities can
be obtained using a multi-channel (or super pixel) lenslet array configuration in which
the windows of four adjacent pixels are clustered together and the optical signals relayed
through a single lenslet facet having dimensions of D_L = 2 D_p. The smart pixel layout is
illustrated in figure 5. Using a similar gaussian beam analysis it can be shown that a
multi-channel relay with a channel density of 1000 channels/cm² may be implemented using a
lenslet array with a focal length of f = 6.5 mm, d_s = 20 µm, an operating wavelength of
850 nm and a window size of 34 µm.
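The single-channel analysis can be sketched numerically. The two restrictions on the beam radii are taken from the text; the link between them is an assumption on our part, namely the standard gaussian focusing relation w_0 = λf/(πw_L) for a collimated beam of radius w_L focused by a lens of focal length f. Combining the three gives a quadratic in the window size, d_w² - (D_p - d_s) d_w + 9λf/π = 0, of which the smaller root is taken. The function names are hypothetical, and the sketch covers only the single-channel case (D_L = D_p), not the multi-channel layout.

```python
import math


def focused_radius_um(w_lenslet_um: float, f_mm: float = 6.5,
                      wavelength_nm: float = 850.0) -> float:
    """Focused 1/e^2 radius w_0 = lambda*f / (pi*w_L), all lengths in microns.

    Assumed relation: collimated gaussian beam focused by an ideal lenslet.
    """
    lam_um = wavelength_nm * 1e-3
    f_um = f_mm * 1e3
    return lam_um * f_um / (math.pi * w_lenslet_um)


def window_size_um(channel_density_per_cm2: float, d_s_um: float,
                   f_mm: float = 6.5, wavelength_nm: float = 850.0) -> float:
    """Window size d_w for a single-channel, square-packed lenslet relay.

    Uses w_L = (D_p - d_s - d_w)/3 and 3*w_0 = d_w from the text together
    with the assumed focusing relation above.
    """
    d_p_um = 1e4 / math.sqrt(channel_density_per_cm2)   # pitch, 1 cm = 1e4 um
    b = d_p_um - d_s_um
    c = 9.0 * (wavelength_nm * 1e-3) * (f_mm * 1e3) / math.pi
    disc = b * b - 4.0 * c
    if disc < 0:
        raise ValueError("no window size satisfies both restrictions")
    return (b - math.sqrt(disc)) / 2.0                  # smaller root


# Example: 600 um pitch and 50 um window spacing (illustrative inputs).
density = 1e8 / 600.0 ** 2
print(f"d_w = {window_size_um(density, 50.0):.1f} um")
print(f"w_0 = {focused_radius_um(175.0):.2f} um for w_L = 175 um")
```

Raising the channel density shrinks the pitch, which reduces the beam radius available at the lenslet and therefore enlarges the focused spot and the window needed to contain it.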
Figure 4 - Variation of d_w with C_d: f = 6.5 mm, λ = 850 nm, d_s = 10, 20, 30 and 40 µm.

Optical  Demonstrator  Design 

The FET-SEED arrays employed in this prototype backplane demonstrator had a pixel
separation of 200 µm, a window size of 25 µm, a window spacing of 50 µm and an operating
wavelength of 850 nm [1]. The optics were designed to connect a 2x2 array of channels
spaced by 600 µm with a device plane-to-device plane separation of just over 33 mm.
Multi-level diffractive lenslet arrays fabricated at Heriot-Watt University were used to
implement the interconnect [4]. These had eight phase levels, a focal length of 6.5 mm and
a measured throughput of 90%. Optical simulations of the set-up described in figure 1
predict a theoretical gaussian 1/e² beam radius of 9.98 µm and a geometric spot size of
2.59 µm at the receiver plane.

Optomechanics  and  Alignment 

One  of  the  disadvantages  of  a  lenslet  array  based  optical  interconnect  is  its  sensitivity  to 
misalignment. An analysis of the system's optical alignment tolerances was carried out to
determine the effect of a specific positional error on optical throughput (Table 1). The results
of  these  calculations  were  used  to  design  an  optomechanical  set-up  capable  of  meeting  the 
required  alignment  tolerances.  The  illumination  and  imaging  systems  shown  in  figure  1  were 
also  constructed  to  allow  the  receiver  and  transmitter  arrays  to  be  viewed  during  assembly. 

Meeting  the  necessary  alignment  tolerances  and  space  constraints  required  the  development 
of  custom  optomechanics.  The  optical  components  were  mounted  in  steel  cylinders, 
prealigned  and  positioned  on  a  magnesium  baseplate  using  a  magnetic  slot  channel 
arrangement  [5].  The  magnets  were  mounted  on  top  of  steel  bars  to  increase  the  restraining 
force.  This  force  can  be  varied  by  altering  the  thickness  of  the  bars.  A  chamfered  groove 
structure  was  chosen  to  reduce  wear  and  improve  alignment  tolerances  (figure  6). 

The  PCBs  were  attached  to  commercial  5-axis  kinematic  translation  stages  bolted  to  the 
baseplate.  These  were  adjusted  to  get  the  required  registration  between  the  two  device  planes. 
Alignment  of  the  input  beams  with  respect  to  the  modulator  array  was  provided  by  a  Risley 
beam  steerer  pair.  The  holders  for  these  elements  employed  ball  bearing  mounts  to  reduce 
wear  and  improve  rotational  alignment  accuracy. 
Figure 5 - Multi-channel interconnect.


System parameter                 Tolerance

Focal length of bulk lenses      ±0.25%
Wavelength                       ±1 nm
Δx, Δy of input array            ±5 µm
Δx, Δy of lenslet arrays         ±10 µm
Rotation of lenslet arrays       ±1°
Translation of receiver array    ±50 µm
Δx, Δy of lenslet arrays         ±1 µm
Rotation of device planes        ±0.25°
Tilt of modulator array          ±0.25°

Table  1  -  Alignment  tolerances. 


Figure  6  -  Optomechanical  layout  of  baseplate. 
Abstract

An active alignment system is demonstrated. The error between the actual and desired spot
location on a quadrant detector is used to compute the angular displacement required for two
Risley Beam Steerers to centre the spot. Theoretical design and experimental results are given.

Introduction 

Free-space  optical  interconnects  hold  the  promise  of  alleviating  communication  bottlenecks 
in  future  connection-intensive  electronic  systems.  A  separable  interconnect  offering  the 
massive  connectivity  of  free-space  optics  will  greatly  increase  the  data  throughput  between 
multi-chip-modules  and/or  printed  circuit  boards  (PCBs)  on  an  optical  backplane. 

Alignment is a key issue facing free-space optical systems. For a system to be of practical
use, it must operate continuously for long periods of time in harsh industrial conditions.

At  least  two  approaches  exist  to  overcome  alignment  drift: 

1)  Design an extremely rigid system which will not drift over time. This can be
performed by removing as many degrees of freedom as possible [1] and by
prealigning.

2)  Implement  active  alignment,  a  process  in  which  system  parameters  such  as 
throughput  or  error  in  spot  position  are  monitored  and  fed  back  to  a  controller 
which  realigns  the  system  by  altering  the  state  of  the  optics.  Such  feedback  loops 
exist  in  CD  players  [2]. 

For an optical backplane system with manually removable PCBs, active alignment will
likely be a necessity. This paper describes an active alignment demonstrator for a free-space
optical link. The basic control loop for the system is shown in Figure 1 and is explained below.


Figure  1:  Basic  control  loop  of  active  alignment  demonstrator 

Risley Beam Steerers (RBSs) were chosen as the optical components to move in an active
alignment experiment because they have already been used for x-y alignment in free-space
optical systems [1] and have proven to be a simple and cheap way of aligning free-space
interconnects. Furthermore, components requiring rotational, as opposed to lateral,
displacement can easily fit into a slotted plate, barrel or other integrated optomechanical setup.
In addition, angular motion is easier to control than rectilinear motion when compensating for
vibrational effects [3].

Definitions  and  assumptions 

Let 

1)  An RBS be a nearly cylindrical optical component with a wedge in it, whose
    purpose is to steer beams of light by imparting an angular displacement to them, as
    shown in Figs. 2 and 3.

2)  The two RBSs in the system be named RBS A and RBS B.

3)  $\beta_A$ and $\beta_B$ be the wedge angles of RBS A and RBS B respectively.

4)  $\theta_A$ and $\theta_B$ be respectively the angular rotations of RBS A and B about the
    z-axis (Fig. 3).

5)  $(x_1, y_1)$ be the Cartesian representation of an arbitrary displacement imparted to a
    spot by the RBSs, and let $d_1$ be the magnitude of the displacement $(x_1, y_1)$:

        $d_1 = \sqrt{x_1^2 + y_1^2}$    (1)

6)  $\theta_1$ be a modified form of $\arctan(y_1/x_1)$, such that $0 \le \theta_1 < 2\pi$.

7)  $(x_u, y_u)$ be the position of the spot centre on the quadrant detector (QD) had it not
    been moved by the RBS pair.


Effect  of  two  RBSs  in  series  on  the  trajectory  of  a  beam 

From first-order theory, the angular deviation, $\delta$, imparted by an RBS on a beam is
$\delta = (n-1)\beta$, where n is the index of refraction of the RBS (Fig. 2). As such, at the
focal plane of a lens with focal length f, the lateral displacement, r, to the spot imparted by
the RBS is:

    $r_A = f \tan\delta_A = f \tan((n-1)\beta_A)$    (2a)

    $r_B = f \tan\delta_B = f \tan((n-1)\beta_B)$    (2b)
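
As a rough numerical illustration of Eqs. (2a) and (2b), the sketch below evaluates the two spot
radii for the focal length and wedge angles quoted later in the demonstrator section. The
refractive index is not given in the paper; n = 1.5 is an assumption made purely for
illustration, and the function name is hypothetical.

```python
import math


def rbs_spot_radius(focal_length_m, wedge_angle_deg, refractive_index):
    """Radius of the circle traced in the focal plane: r = f * tan((n - 1) * beta)."""
    deviation_rad = (refractive_index - 1.0) * math.radians(wedge_angle_deg)
    return focal_length_m * math.tan(deviation_rad)


# f and the wedge angles are the demonstrator values from the paper.
# N_ASSUMED is NOT from the paper: it is an assumed, typical glass index.
F = 0.100          # focal length in metres
N_ASSUMED = 1.5

r_a = rbs_spot_radius(F, 0.208, N_ASSUMED)
r_b = rbs_spot_radius(F, 0.278, N_ASSUMED)
print(f"r_A = {r_a * 1e6:.1f} um, r_B = {r_b * 1e6:.1f} um")
```

Under that assumed index the two radii differ by several tens of microns, which is the kind
of mismatch that matters when the two wedges are not identical.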


Rotating the wedge about the z-axis will cause the spot in the focal plane to move along the
circumference of a circle of radius r in the x-y plane at the lens' focal plane. If RBS A and B
are laid out in series along the beam path, the angular displacements imparted by the two
RBSs will add. When RBS A and B are rotated about the z-axis, the spot will move on the
periphery of a first circle of radius $r_B$ whose centre is located on the periphery of a second
circle of radius $r_A$, as shown in Figure 3.

Given the above explanation and definitions, it is possible to outline a centering algorithm.
The following is an algorithm for determining the angular rotational displacement,
$(\theta_A, \theta_B)$, of the two RBSs required to centre on the QD a spot which is currently
misaligned by $(\Delta x_e, \Delta y_e)$:

1)  Determine $(x_u, y_u)$, where $(\theta'_A, \theta'_B)$ are the previous values of
    $(\theta_A, \theta_B)$.

        $x_u = \Delta x_e - (r_A \cos\theta'_A + r_B \cos\theta'_B)$    (3a)

        $y_u = \Delta y_e - (r_A \sin\theta'_A + r_B \sin\theta'_B)$    (3b)

2)  Determine $(\theta_A, \theta_B)$ necessary for the RBSs to impart a displacement of
    $(-x_u, -y_u)$.

The Cartesian to Risley (CtoR) transform maps $(-x_u, -y_u)$ to $(\theta_A, \theta_B)$:

    $d_1 = \sqrt{x_u^2 + y_u^2}$    (4)

    $\theta_1 = \arctan(-y_u / -x_u)$, where $0 \le \theta_1 < 2\pi$    (5)

    $\theta_a = \arccos\left(\frac{r_B^2 - r_A^2 - d_1^2}{-2 r_A d_1}\right)$    (6a)

    $\theta_b = \arccos\left(\frac{r_A^2 - r_B^2 - d_1^2}{-2 r_B d_1}\right)$    (6b)

The new value of $(\theta_A, \theta_B)$ is thus:

    $\theta_A = \theta_1 + \theta_a$    (7a)

    $\theta_B = \theta_1 - \theta_b$    (7b)


$(\theta_A, \theta_B)$ is now the desired input pair of angles to be fed to the stepper motor
system controlling the RBSs. For the cases in which more than one solution is possible (equal
wedge angles), the algorithm chooses the solution consistent with the geometry of Figure 3.

It is interesting to note the restrictions on $d_1$: 1) $d_1 < r_A + r_B$ and
2) $d_1 > |r_A - r_B|$.

The first condition simply implies that the total misalignment distance, $d_1$, must be less
than the total radius of action of the RBSs. The second condition is more problematic and
implies that there is a region, a 'blind area', to which the RBSs cannot centre a spot.

The blind area does not mean that the mismatched RBSs will never be able to centre a
spot. A blind area means that the RBSs will never be able to impart a net displacement of zero
µm to the spot. The corollary of this statement is: only if the spot is more than $|r_A - r_B|$
away from the centre before RBS insertion can it be centered after RBS insertion. In the future,
matched wedges obtained by cutting one RBS in two along its diameter will be investigated.
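
The two-step centering algorithm and the reachability restrictions can be sketched in a few
lines of Python. This is an illustrative reading of steps 1 and 2 and of Eqs. (5) to (7b), not
the authors' LabVIEW implementation; the function names are hypothetical, `atan2` is used as
one way of obtaining the "modified arctan" on [0, 2π), and the equal-radii case at zero
displacement is resolved by an arbitrary choice.

```python
import math


def risley_displacement(theta_a, theta_b, r_a, r_b):
    """Spot displacement produced by the pair at rotation angles (theta_a, theta_b)."""
    return (r_a * math.cos(theta_a) + r_b * math.cos(theta_b),
            r_a * math.sin(theta_a) + r_b * math.sin(theta_b))


def cartesian_to_risley(x, y, r_a, r_b):
    """CtoR transform: angles (theta_A, theta_B) that displace the spot by (x, y)."""
    d = math.hypot(x, y)
    if d > r_a + r_b or d < abs(r_a - r_b):
        raise ValueError("displacement outside the reachable annulus (blind area or too far)")
    if d == 0.0:
        # Only reachable with matched wedges; opposing orientations cancel (arbitrary choice).
        return 0.0, math.pi
    theta_1 = math.atan2(y, x) % (2.0 * math.pi)
    # Law of cosines; the clamp only guards against rounding just outside [-1, 1].
    cos_a = (r_b ** 2 - r_a ** 2 - d ** 2) / (-2.0 * r_a * d)
    cos_b = (r_a ** 2 - r_b ** 2 - d ** 2) / (-2.0 * r_b * d)
    theta_a = math.acos(max(-1.0, min(1.0, cos_a)))
    theta_b = math.acos(max(-1.0, min(1.0, cos_b)))
    return theta_1 + theta_a, theta_1 - theta_b


def centering_step(err_x, err_y, prev_theta_a, prev_theta_b, r_a, r_b):
    """One iteration: recover the undeflected position, then cancel it."""
    dx, dy = risley_displacement(prev_theta_a, prev_theta_b, r_a, r_b)
    x_u, y_u = err_x - dx, err_y - dy          # step 1
    return cartesian_to_risley(-x_u, -y_u, r_a, r_b)  # step 2
```

With ideal measurements a single call to `centering_step` centres the spot exactly; in the
experiment, measurement error and backlash are what make a second iteration useful.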


Figure 3 a): Effect on beam of two RBSs in series. Z axis is coaxial with RBSs. b) Locus of
spot movement on the focal plane, looking down the z-axis. c) Expansion of b) with
mathematical constructions used in derivation of transforms.

Initial  conditions 

Step 1 of the algorithm assumes that the previous values of $(\theta_A, \theta_B)$ are known.
However, the algorithm is only valid for the "steady-state": in order to begin, an initialization
routine ('hunting') must determine the initial angular displacement of the RBSs. Given an
unknown initial angular displacement $(\theta'_A, \theta'_B)$ of the RBSs and an arbitrary
$(\Delta x'_e, \Delta y'_e)$, hunting involves rotating both RBSs until the spot's $\Delta x_e$
coordinate is at a maximum (maximum deflection to the right). By definition, this angular
position corresponds to $(\theta_A, \theta_B) = (0, 0)$.

Demonstrator  and  experimental  results 

The physical values used in the demonstrator were as follows: f = 100 mm, $\beta_A$ = 0.208°
and $\beta_B$ = 0.278°. The beam was defocused on the QD such that the spot size was
3w ≈ 585 µm (99% power) in order to obtain a greater range for error detection. The lens and a
custom holder for the RBSs were installed on a Spindler & Hoyer microbench setup built around
stepper motors and a gear reduction system for rotating the RBSs. The QD was a Hamamatsu
S4349. With a 10 V applied reverse bias, the generated photocurrents produced voltage swings
of 0-10 V across 330 kΩ resistors, which were sampled by a National Instruments Lab-NB™ I/O
board on a Macintosh; the algorithm was encoded in LabVIEW™ running on the Macintosh. The
voltage pulses for turning the motors were sent out by the Lab-NB™. A UCN 5804 chip between
the board and motors provided power gain for the motors.

Fig. 5a shows a typical run. The spot centre started at position 1,
$(\Delta x_e, \Delta y_e)$ = (-127, 92) µm. After one iteration, the spot ended at position 2,
$(\Delta x_e, \Delta y_e)$ = (-17, 12) µm. The error was likely due to laser fluctuations,
mechanical backlash, and measurement errors occurring when the spot is far from the QD centre.
A second iteration brought the spot centre to position 3,
$(\Delta x_e, \Delta y_e)$ = (-3.8, -6) µm. Figs. 5b and 5c show a spot before and after a
typical steady-state centering run. Table 1 outlines the results obtained from eight consecutive
centerings performed by the demonstrator. In each case, the spot was aligned to within ~20 µm
in one iteration and to within 10 µm in two iterations. The measurement accuracy of the system
near the origin was ±5 µm for each coordinate $(\Delta x_e, \Delta y_e)$, and mechanical
backlash caused an uncertainty of ±2 µm in the spot position.

Figure 5 a): Centering operation. b) and c): Spot before and after a centering run. Initially, the
spot was manually misaligned to position (-213, -7) µm. After one iteration, the spot was at
(-8, 20) µm. (Image is inverted and reflected by imaging system.)


Initial (Δx_e, Δy_e)   Initial distance   Final (Δx_e, Δy_e)   Final distance   Iterations
                       to centre                               to centre

(-127, 92)             [illegible]        (-3.8, -6)           7.1              2
(109, 5)               109.1              (-0.8, -2)           2.15             2
(-53, 10)              54                 (1.1, -6.1)          6.2              1
[illegible]            119.4              (7.2, -1.4)          7.3              2
[illegible]            147                [illegible]          5.8              2
(-85, 19)              87.1               [illegible]          5.4              1
(90, 103)              136                [illegible]          9.8              1
(89, 76)               117                (-1.3, 3.9)          4.1              3

Table 1: Result of consecutive centering operations. All distances and positions in microns.


Conclusion  and  future  directions 

An active alignment system using Risley Beam Steerers to centre a spot on a quadrant
detector was successfully designed, built and implemented. Future research will involve
scaling the system to arrays of interconnects, for example by using higher orders of spot
arrays to illuminate quadrant detectors on the periphery of a device array.

Abstract. A nematic liquid crystal three-terminal device for optical beam steering is
demonstrated using a thin-film resistor network on the substrate layer, forming a near-continuous
index perturbation adaptive lens. A 12 cm minimum focal length is
demonstrated for a 1 mm square device.

1.  Introduction 

The use of nematic liquid crystals (NLCs) for making variable focal length lenses was first
described as early as 1979, when S. Sato showed how NLC cells shaped like a plano-convex
lens or a plano-concave lens could be electrically controlled to form adaptive lenses [1]. In
1984, S. T. Kowel et al. used a simple parallel geometry electrode structure (each electrode has
a different applied voltage) in a uniform thickness NLC cell to demonstrate the optical focusing
effect [2-3]. Later, in 1988, the same group showed optical beam translation with a similar
electrode array geometry device, and described the effects of fringing fields of the electrodes on
the NLC index perturbation [4]. Although using a grid of independently driven parallel
electrodes does produce an adequate refractive index perturbation across the NLC cell area to
cause optical beam focusing, a useful amount of light is lost via the non-ideal, near step-wise
index variation (ideally, a quadratic index perturbation with distance is required) caused by
the discrete nature of the electrodes and their applied voltages. Therefore, it would be beneficial
if the index variation induced by the electrode structure could be made more gradual and smooth,
thus improving the optical efficiency of the NLC lens. Also note that a useful NLC lens with
this parallel electrode geometry would typically require > 100 electrodes, implying more than
100 independent electronic drivers. Thus the whole purpose of using a compact, lightweight,
low-cost programmable lens would be defeated by the cost, weight, and power consumption of the
electronics used to control the lens. Recently, we have described how the index perturbation
between the electrode pairs can be made smooth, and how the need for independent electronic
drivers can be eliminated, thus making a simple 3-terminal control device with one electronic
driver [5-6]. This paper highlights these novel NLC device concepts.

2.  Device  Design  and  Experiment 


Fig. 1 The novel thin-film resistor network biased nematic liquid crystal lens design (top view).


Fig. 1 shows the top view of the novel thin-film resistor biased NLC cylindrical lens. An
indium tin oxide (ITO) thin-film resistive network of specific resistor values generates a
quadratically varying voltage gradient across the symmetrical (about the lens center) parallel
electrode structure, with the resistors in series with the molybdenum metal electrodes. By
applying a constant amplitude ac (e.g., 1 kHz square wave) signal to the top and bottom
terminals (or electrodes) of the NLC device (see Fig. 1), with the center electrode biased near
the NLC threshold value (near 1 V) for molecular activation, a piece-wise quadratically varying
index perturbation is generated by the electrode structure using a single driver or voltage level.
By varying the voltage level, the focal length of this lens can be easily changed. Because there
is a gap between the parallel electrodes, the NLC molecules in this region do not cause the
required quadratic index perturbation behavior. We have somewhat alleviated this problem by
adding a thin film of very high resistance (e.g., 10 Mega-ohms/square mm) amorphous silicon
to the lens area before depositing the much lower resistance (e.g., 0.4 ohms/square mm)
parallel electrode structure. Thus, the amorphous silicon acts as a resistor between the
conductor electrodes, generating a linear voltage (index) gradient across the electrode gaps.
This behavior results in a smoother approximation to the desired continuous quadratic index
perturbation required for the lens effect, thus improving the lens efficiency by reducing the
diffracted light that is lost when the lens approximation occurs in discrete steps.
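
To illustrate how a single driver can produce a quadratic voltage profile, the sketch below
sizes an idealised series resistor chain running from the center electrode to one edge
terminal. It assumes an unloaded divider (no current drawn by the electrodes) and equally
spaced electrodes; the number of steps and the total resistance are hypothetical, and the
1 V and 4 V end values are simply the threshold and maximum drive amplitudes quoted in the
text. The actual resistor values used in the device are not given in the paper.

```python
def quadratic_ladder(n_steps, v_centre, v_edge, r_total):
    """Resistors and tap voltages giving V_k = Vc + (Ve - Vc) * (k / N)**2.

    The same current flows through every resistor of an unloaded series chain,
    so each resistor is proportional to the voltage step across it, i.e. to
    k**2 - (k - 1)**2 = 2k - 1.
    """
    taps = [v_centre + (v_edge - v_centre) * (k / n_steps) ** 2
            for k in range(n_steps + 1)]
    resistors = [r_total * (2 * k - 1) / n_steps ** 2
                 for k in range(1, n_steps + 1)]
    return resistors, taps


# Hypothetical example: 4 steps, 16 kOhm total, centre at 1 V, edge at 4 V.
resistors, taps = quadratic_ladder(4, 1.0, 4.0, 16000.0)
print(resistors)
print(taps)
```

The resistor values grow in the ratio 1 : 3 : 5 : 7, which is what makes the tap voltages
follow a parabola rather than a straight line.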

Fig. 2 shows the experimental results from our 1 mm square NLC device. The figures
on the left show a color-coded (or grey-scale) approximately quadratic left-to-right
(symmetrical at center of lens) optical phase/index perturbation shown by the device for two
different applied voltage levels (or focal lengths) when viewed with white light using the device
between crossed polarizers. The top photo on the right shows collimated 633 nm He-Ne light
passing through the electrically turned-off NLC device.


Fig. 2 Demonstration of the 1 mm square NLC device fabricated at GE-CRD.

The diffraction effects observed are due to the light-opaque square perimeter metal
mask surrounding the active area of the device. The middle right photo shows the focusing
effect when the device is turned on. Both pictures are taken at the NLC minimum focal length
of 12 cm using a CCD. The focal length was variable from this minimum focal length to
infinity by changing the amplitude of the single driver from 4 V to 1 V (the NLC threshold).
The bottom photo on the right shows the focussed main beam and its sidelobes in temporal
form as the output CCD video signal observed on an oscilloscope. Note that the sidelobes are
much lower than the main beam, a desired feature of any good lens. In order to observe these
relatively low sidelobes, we increased the laser beam power incident on the device, which
saturates the CCD in the main lobe but produces a much brighter image of the previously
low-level sidelobes (see middle photo).

3.  Applications 

Key applications for this adaptive NLC device include optical beam spoiling for laser
communications applications, misalignment correction in optical recording/reading systems
and beam deflection/alignment applications in optical computing and interconnection
architectures. Figures 3 and 4 show two such programmable lens applications. It is important

Abstract. A monolithic integration of fiber holder arrays and microlens arrays
well aligned to each other can be achieved by deep proton irradiation.


1.  Introduction 

For almost all fiber applications the combination of fibers and microlenses is of evident
relevance. A microlens can interact with a fiber by coupling light from a source into a fiber
and by collimating light coming out of a fiber. In both cases the longitudinal and the lateral
displacement between fiber and lens have to meet strict tolerances, especially if single
mode fibers are required. Also of interest is the alignment between arrays of fibers and
arrays of microlenses. Problems arise from the fact that conventional methods of fabricating
a fiber holder and a microlens are fairly distinct. For several years many different
fabrication methods for microlenses have been published [1,2,3]. Due to the lithographic
design methods it is easy to fabricate arrays of microlenses instead of only a single
microlens. This is not the case for fiber holders. If several optical fibers have to be aligned,
one line of fibers can be arranged in V-grooves [4,5,6]. A more recent publication tries to
integrate a 4 by 4 array of single mode optical fibers in one substrate using excimer laser
ablation [7]. But even here there is still a need for a microlens array well adapted to the
spacing and the tolerances of the fiber holder array.

A fabrication method for both fiber holder arrays and microlens arrays monolithically
integrated on one substrate is the technique of deep proton irradiation [8,9]. This is a tool
for the fabrication of microlens arrays as well as of hole arrays which
are suitable for fiber holding. A combination of these two kinds of arrays allows an assembly
of fibers with lenses on top. Tolerances in the fiber diameter can be reduced by the
possibility of self-centering of the fibers in the holder.


2.  Fabrication  Method 

The fabrication method is based on the fact that the irradiation of PMMA with protons
changes the molecular weight. Thus those areas irradiated with more than a critical dose can be
dissolved in a special developer [8]. Furthermore the reduction of molecular weight increases
the swelling potential of the material, which means that monomer vapor can diffuse only into
the domains with sufficiently low molecular weight, where it produces a significant swelling [9].

For the fabrication of fiber holder arrays and microlens arrays a PMMA substrate is
irradiated through a mask with an array of circular apertures corresponding to the size and
pitch of the fiber. Important in this context is the distribution of molecular weight caused by
the proton irradiation. The simulation [10] in fig. 1 shows lines of equal dose deposition in
a PMMA sample of 500 µm thickness (which is in this range equivalent to the distribution
of molecular weight). Dose deposition is determined by the energy dependence of the
absorption coefficient and by scattering. An ideal edge is assumed, irradiated homogeneously
with N × 10^[illegible] particles per cm² at a proton energy of 7 MeV.

Fig. 1 Dose distribution in PMMA under an ideal edge

The graph shows that at the bottom a significant dose deposition can be obtained in the
geometrical shadow region of an edge, which is caused by the lateral straggling of the
protons. Furthermore it can be seen that the dose increases downwards, which means that the
maximum dose is deposited at the bottom of the structure. The choice of the dose parameter
N determines the level in the substrate where the development process stops. In fig. 1
three areas are labelled which are of interest with respect to applications for fiber optics [11].

For the fabrication of fiber holders the dose is chosen such that the developer can
dissolve domains 1 and 2. This results in a fan-shaped hole through the substrate. A fiber
can be plugged into this hole and afterwards the remaining domain 3 can be swollen in a
styrene vapor atmosphere, thus centering the fiber. Fig. 2 shows a photograph of a 4x5
single mode fiber array fabricated by this method. The fiber diameter is 125 µm.

For the monolithic integration of fiber holders and lenses the dose given to the
substrate has to be chosen lower than before so that only domain 1 is developed. The
resulting structure now is a blind hole where a fiber can be plugged in. A subsequent
diffusion of styrene vapor will swell domains 2 and 3. The swelling of domain 2 forms a
surface lens on top of the substrate well aligned to the fiber and, as above, domain 3 is
swollen for centering and fixing the fiber in the blind hole. Fig. 3 shows the depth of
domain 1 as a function of the dose in a PMMA substrate with a thickness of 500 µm.

Fig. 4 is a photograph of a 4x5 array of blind holes irradiated with 15.5 × 10^[illegible] ions/cm².
The depth of the holes is 160 µm. The hole is slightly fan-shaped with a bottom diameter
of 140 µm and a top diameter of 130 µm.

Fig. 5 is a side view of a plugged fiber with a monolithically integrated surface
microlens on top. A fiber with a diameter of 140 µm is plugged into a 160 µm deep blind
hole. The substrate thickness is 500 µm.


3.  Summary 

The method of deep proton irradiation allows the fabrication of fan-shaped fiber
holders which can compensate small tolerances in the fiber diameters by swelling the holder.
Arrays of fiber holders and microlenses can be fabricated together in the same PMMA
substrate. This kind of monolithic integration ensures that the microlenses are exactly on top
of their corresponding fiber holders, so no further alignment is necessary.


Fig. 3 Measured depth of development as a function of the dose (axis: number of ions per cm²)

Fig. 4 4x5 array of blind holes in PMMA; depth: 160 µm, diameter: 140 µm (bottom), 130 µm (top)

Fig. 5 Side view of a monolithically integrated fiber lens combination; parameters are as in fig. 4

Abstract. We address the need to develop new design algorithms and optimisation techniques
for Computer Generated Holograms (CGH) that have to work efficiently within an
optoelectronic packaged system. First, we develop practical solutions to increase appreciably the
performance of a modified Gerchberg-Saxton (G-S) iterative algorithm. We then analyse the
various effects of the different packaging tolerance constraints on the CGH performance, and
finally propose some iterative optimisation techniques to overcome these drawbacks.


1.  Introduction. 

The design and optimisation process of Computer Generated Holograms (CGHs) for free-space
optical interconnections (FSOI) has been widely investigated [1],[2],[3]. Recently [4],
we have demonstrated the potential of a modified G-S algorithm, and compared its
performance to several analytical generation methods to design Fresnel CGHs for FSOI. On
the other hand, considerable work has been done on packaging and tolerance analysis in
optoelectronic packaged systems [5],[6]. In this paper we investigate the possibility of designing
and optimising Fresnel CGHs that accommodate some of the tolerance constraints that occur in
packaged systems.


2.  The  modified  G-S  iterative  algorithm. 

The G-S algorithm is a powerful and fast algorithm to design and optimise Fresnel CGHs for
FSOI, when multiple fan-out and focussing properties are desired at the same time in the
same element [4]. However, much improvement can still be made in the uniformity of the
reconstruction by implementing different solutions into the basic G-S loop.

We have chosen a 16-phase-level element, 250 µm square, composed of 128 by 128 kinoform-encoded
square cells, with a focal length f = 800 µm and a fan-out of 11 reproducing the letters 'OC'.
For the source, we use a λ = 830 nm laser diode with an elliptic beam waist.

Fig. 1: Diffraction efficiency and uniformity of reconstruction at focal plane, when applying
several error reinjection methods into the G-S algorithm.

The results in diffraction efficiency (after approx. 10 loops) are not affected by the changes in
the algorithm, but the uniformity has been much increased, especially when the error to be
reinjected is scaled by a factor k which depends on the convergence rate (see Fig. 1). Also, it
shows that the reinjection in the reconstruction plane is not necessary, whereas the reinjection
in the CGH plane makes all the difference.
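
For readers unfamiliar with the basic G-S loop that these modifications build on, the sketch
below shows a minimal one-dimensional version. It is deliberately simplified and is not the
authors' algorithm: it designs a Fourier (far-field) hologram rather than a Fresnel one, it
omits the error reinjection discussed above, and the cell count, target orders and loop count
are illustrative assumptions. Only the 16 phase levels echo the element described in the text.

```python
import cmath
import math
import random


def dft(values, sign):
    """Naive unitary DFT; sign=-1 is the forward transform, sign=+1 the inverse."""
    n = len(values)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(v * cmath.exp(sign * 2j * math.pi * m * k / n)
                    for m, v in enumerate(values))
        for k in range(n)
    ]


def gerchberg_saxton(target_orders, n_cells=32, n_levels=16, n_loops=20, seed=0):
    """Return (quantised phases, efficiency) of a 1-D phase-only Fourier hologram."""
    rng = random.Random(seed)
    amplitude = 1.0 / math.sqrt(len(target_orders))
    indices = [k % n_cells for k in target_orders]
    # Start from the wanted spot amplitudes with random phases.
    far = [0j] * n_cells
    for idx in indices:
        far[idx] = amplitude * cmath.exp(2j * math.pi * rng.random())
    phases = [0.0] * n_cells
    for _ in range(n_loops):
        near = dft(far, +1)
        phases = [cmath.phase(v) for v in near]          # hologram plane: keep phase only
        spectrum = dft([cmath.exp(1j * p) for p in phases], -1)
        far = [0j] * n_cells                              # signal plane: impose amplitudes,
        for idx in indices:                               # keep the computed phases
            far[idx] = amplitude * cmath.exp(1j * cmath.phase(spectrum[idx]))
    # Quantise to the available phase levels and evaluate the result.
    step = 2.0 * math.pi / n_levels
    phases = [round(p / step) * step for p in phases]
    spectrum = dft([cmath.exp(1j * p) for p in phases], -1)
    total = sum(abs(v) ** 2 for v in spectrum)
    efficiency = sum(abs(spectrum[idx]) ** 2 for idx in indices) / total
    return phases, efficiency


phases, efficiency = gerchberg_saxton((-3, -1, 1, 3))
print(f"efficiency into the 4 target orders: {efficiency:.3f}")
```

The naive DFT keeps the example free of dependencies; a practical design at 128 by 128 cells
would use a fast transform instead.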


3.  Effects  of  packaging  tolerances  on  the  CGH  performances 

For a CGH to work efficiently within a packaged system, it has to meet the tolerance
standards of the system in which it has been packaged. These include:

-  Fabrication tolerances in both the Diffractive Optics Plane (DOP) and the Processing Plane
(PP): systematic errors (etch depth errors, lateral misalignments), random errors (waviness and
roughness of substrate) and encoding errors (square cells for kinoform encoding).

-  Alignment tolerances between the DOP and PP (use of spacers, flip-chip bonding, ...).

-  Operating tolerances (spectral bandwidth, operating mode and beam waist of source).

-  Local thermal expansion coefficients of DOP, spacers, PP, and thermal effects on source
(spectral shift).

-  Architecture tolerances (placement of processing elements), dimension of active surface on
detectors, position of dots within the PP.

We have numerically simulated the effects of these tolerances by considering the diffraction
efficiency, uniformity, shift and defocusing of the spots for four different packaging
tolerance constraints.

Fig. 2  reports  the  results  for  several  packaging  tolerance  constraints. 


Fig. 2: Effects of spectral shift of source, uniform etch depth errors, longitudinal misplacement
and waviness of substrate on CGH performance.

4.  Fresnel CGH design algorithms including packaging tolerance constraints

The packaging tolerances investigated in Section 3 have in common the variation of the
wavelength λ (spectral shift and indirectly etch depth errors and waviness), and/or focal length f
(longitudinal misalignment). Also, it has been shown that the thermal expansion coefficient
relates the change in f to the change in temperature [7]. Besides, the kernel functions we use
do not see the variation of λ or f distinctly, but only the variation of the factor λ·f (e.g. fast
Fresnel transform). So most of the tolerances described here can then be related to
longitudinal misplacement (i.e. variation of f). If we can design a CGH with extended depth
of focus, we can accommodate these various tolerances. We have therefore developed 3
different algorithms based on the G-S loop.

The first one bounces back and forth between the CGH, the focal plane and two planes on each
side of the focal plane, where the constraints are the waists of the defocussed beams
decreased by a factor a (thus increasing the depth of focus).

The second one is based on a G-S optimised Fourier CGH which performs the reconstruction
in the far field: this reconstruction is brought into the near field by modulating the Fourier
CGH by the following phase profile:

    $\varphi = \pi\,[\ldots]\,/\,\left(f_0 + L_{depth}\cos\left(\operatorname{arctg}(y/x)\right)\right)$

(a spherical lens with a focal length varying to $f_0 + L_{depth}$).

The third method uses an extended kernel function instead of the Fresnel transform in the G-S
loop (both quadratic phases in the expression of the Fresnel transform are replaced by the
above analytical expression). To compare their performance to the straightforward Fresnel
G-S algorithm, we consider the amount of light falling into 11 square detectors of side 10.0 µm
in the PP, and the beam waists and lateral shifts of the 11 spots, when the longitudinal
displacement varies from -10% to +10%. Fig. 3 reports these variations.


Fig. 3: Variation of efficiency, beam waist and lateral shift, for the different algorithms, when
longitudinal misalignments occur. (Legend: basic G-S; optimised G-S on 4 planes; modulation.)

5,  Conclusion, 

We have discussed the need to design and optimise Fresnel CGHs for free-space optical
interconnections by considering closely several packaging tolerance constraints. We analysed
the effects of these tolerances on the CGH performance, and proposed different algorithms
to design multiple-fan-out Fresnel CGHs that will accommodate these tolerances.

Abstract: The implementation of a general concept for the design of diffractive phase
elements, with a trapezoidal pulse structure, for array illumination is described. Results are
presented for on-axis two-dimensional array illuminators designed using this method. The final
designs are within 5% of the calculated upper bound of the diffraction efficiency for the given
signal, and signal non-uniformity is below 1%.


1.  Introduction 

Design theory for diffractive optical elements (DOEs) is a rapidly expanding area of
research, especially within the area concerned with the design of DOEs that operate within
the paraxial domain. In particular the needs of digital optics and material processing lead to a
requirement for diffractive phase elements (DPEs) that split an incoming light beam into an
array of prescribed light spots, i.e., array illuminators.

The design problem for array illuminators is expressed in general terms as a desire to
find a surface-relief phase profile $\theta(\mathbf{u})$ (where $\mathbf{u} = (u, v)$) that has a phase transmittance or
reflectance $H(\mathbf{u}) = \exp[i\theta(\mathbf{u})]$ which satisfies (within the paraxial domain)

$$|S(\mathbf{x})|^2 = |\mathcal{F}[H(\mathbf{u})]|^2, \quad \mathbf{x} \in W, \tag{1}$$

where $\mathbf{x} = (x, y)$, $\mathcal{F}$ is the Fourier transform operator and $S(\mathbf{x})$ is the complex amplitude
signal of prescribed intensity defined within the signal window $W$. For the case of a general
array illuminator the prescribed signal is given by

$$S(\mathbf{x}) = \sum_{m,n} S_{m,n}\,\delta(x - m\Delta x)\,\delta(y - n\Delta y), \tag{2}$$

where $S_{m,n}$ are the desired complex amplitudes of the signal, $\delta(x)$ is the Dirac delta function
and $\Delta x$ and $\Delta y$ are the spacings of the orders in the signal plane.

Typically the diffraction efficiency of the signal is also specified such that

$$\eta(Z) \geq \eta_{\min}, \tag{3}$$

where $\eta(Z)$ is the diffraction efficiency of the final DPE of $Z$ phase levels and $\eta_{\min}$ is the
minimum diffraction efficiency required for the application. The dependence of the
diffraction efficiency upon $Z$ is a product of the fabrication techniques commonly employed
to realise DPEs. The most widely used is multi-mask microlithography. This requires the
phase profile $\theta(\mathbf{u})$ to be quantised to a discrete number of phase levels (ideally $2^N$, where $N$
is an integer). Thus diffraction efficiency becomes a function of $Z$. Information about the
minimum value of $Z$ necessary to satisfy Eq. (3) would facilitate the fabrication process and
reduce intrinsic fabrication errors. It is clear that the greater the number of masking steps, the
larger the error becomes [1].

A design concept embedded within the theoretical framework of diffraction elements
that solves the design problem above has been described in Ref. [2]. Given here is an
implementation of that design concept for multilevel array illuminators. Particular emphasis
is placed on the advantages of a parameterisation that leads to trapezoidal phase pulses.


2. Optimisation of the phase of $S(\mathbf{x})$

Equation (1) is only concerned with the magnitude of the complex amplitude signal $S(\mathbf{x})$;
this leaves the phase of $S(\mathbf{x})$ as a free parameter. This can be used to provide information
about the upper bound of the diffraction efficiency, $\eta_u(Z)$, of quantised DPEs and suggest an
initial starting point for the DPE design [2]. The diffraction efficiency upper bound is given
by [3]

$$\eta_u(Z) = \langle |H(\mathbf{u})| \cos[\Delta\theta(\mathbf{u}, Z)] \rangle^2, \tag{4}$$

where $\Delta\theta(\mathbf{u}, Z)$ is the minimum phase change needed to map $\theta(\mathbf{u})$ onto an allowed phase
level and $\langle\ \rangle$ indicates an averaged value. By optimising the phase of $S(\mathbf{x})$ we minimise
$\Delta\theta(\mathbf{u}, Z)$ and $\eta_u(Z)$ approaches a maximum. In this way we gather information that relates
diffraction efficiency to phase levels for a given signal and can ensure that Eq. (3) will hold
before continuing with the DPE design.
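
The averaged-cosine bound can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, the element is taken to be phase-only (|H| = 1), and the phases are assumed to be spread evenly over a full cycle, in which case the average reduces to the familiar sinc-squared dependence on the number of levels Z.

```python
import math

def quantise_phase(theta, levels):
    """Return (nearest allowed phase level, phase change needed to reach it)."""
    step = 2 * math.pi / levels
    nearest = round(theta / step) * step
    return nearest, nearest - theta

def efficiency_upper_bound(phases, levels, amplitudes=None):
    """Square of the sample average of |H| * cos(delta_theta)."""
    if amplitudes is None:
        amplitudes = [1.0] * len(phases)   # assumption: phase-only element
    total = 0.0
    for theta, amp in zip(phases, amplitudes):
        _, delta = quantise_phase(theta, levels)
        total += amp * math.cos(delta)
    return (total / len(phases)) ** 2

# Assumed test case: phases evenly spread over [0, 2*pi).
phases = [2 * math.pi * k / 4096 for k in range(4096)]
bound_4 = efficiency_upper_bound(phases, 4)
bound_8 = efficiency_upper_bound(phases, 8)
```

For this evenly spread case the values come out at roughly 0.81 for Z = 4 and 0.95 for Z = 8; a signal phase optimised for a particular array will generally give a different figure.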

Once the optimised complex signal $S'(\mathbf{x})$ is found (all optimised quantities will be
indicated with a dash) a projection of the Fourier spectrum of $S'(\mathbf{x})$ is made onto the allowed
phase levels,

$$H'(\mathbf{u}) = \exp[i(\theta'(\mathbf{u}) + \Delta\theta(\mathbf{u}, Z))]. \tag{5}$$

This  provides  a  set  of  phases  for  a  sampled  quantised  phase  hologram.  The  parameterisation 
of  this  data  set  is  dealt  with  in  the  next  section. 

To find $\eta_u(Z)$ and $H'(\mathbf{u})$, i.e. to implement Eqs. (4) and (5), we use the iterative
Fourier transform algorithm (IFTA) [4]. This is a computationally efficient non-linear
optimisation technique that employs the projection method. A fast Fourier transform is used
to project a sampled complex data set between signal and hologram planes. In these planes
the data set is subject to a number of constraints stemming from the design problem stated
above. In the signal plane, while the phase is a free parameter, the magnitude of the complex
signal is maintained at the correct level. In the hologram plane the constraints ensure that the
requirements for a quantised phase-only element hold. The optimal signal phase is found
and $\eta_u(Z)$ determined by alternating between signal and hologram planes for a fixed number
of iterations (typically 100). A final projection provides $H'(\mathbf{u})$. By running the algorithm
with different random starting values for the signal phase we ensure that stagnation does not
occur.
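
The alternating projections described above can be sketched in one dimension. This is a minimal illustration, not the authors' implementation: all names are hypothetical, a naive discrete Fourier transform stands in for the FFT so that the example needs only the standard library, and the field outside the signal window is simply left free.

```python
import cmath
import math
import random

def dft(values, sign):
    """Naive unitary DFT (sign=-1: hologram to signal, sign=+1: signal to hologram)."""
    n = len(values)
    scale = 1 / math.sqrt(n)
    return [scale * sum(v * cmath.exp(sign * 2j * math.pi * j * k / n)
                        for j, v in enumerate(values))
            for k in range(n)]

def quantise(phase, levels):
    """Snap a phase to the nearest of `levels` equally spaced values."""
    step = 2 * math.pi / levels
    return round(phase / step) * step

def to_hologram(signal, levels):
    """Hologram-plane projection: unit magnitude, phase on an allowed level."""
    return [cmath.exp(1j * quantise(cmath.phase(h), levels)) for h in dft(signal, +1)]

def ifta(target_mag, window, levels, iterations=100, seed=0):
    rng = random.Random(seed)
    # Start in the signal plane: prescribed magnitude, random (free) phase.
    signal = [m * cmath.exp(2j * math.pi * rng.random()) for m in target_mag]
    for _ in range(iterations):
        field = dft(to_hologram(signal, levels), -1)
        # Signal-plane projection: restore the magnitude inside the window,
        # keep the phase; leave the field outside the window untouched.
        signal = [target_mag[k] * cmath.exp(1j * cmath.phase(f)) if k in window else f
                  for k, f in enumerate(field)]
    hologram = to_hologram(signal, levels)          # final projection
    field = dft(hologram, -1)
    total = sum(abs(f) ** 2 for f in field)
    efficiency = sum(abs(field[k]) ** 2 for k in window) / total
    return hologram, efficiency

n = 32
window = {0, 1, 2, n - 2, n - 1}                    # orders -2..2 of a 1 x 5 fan-out
target = [math.sqrt(n / len(window)) if k in window else 0.0 for k in range(n)]
hologram, efficiency = ifta(target, window, levels=4, iterations=50)
```

Changing `seed` gives a different random starting phase, which corresponds to the restarts used in the text to avoid stagnation.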

3. Trapezoidal parameterisation

If we assume that the profile is smooth between sample points (which is the case if a large
enough number are used) then the sample points are transition points of a complicated phase
topology. At present no design concepts exist to describe a hologram of arbitrarily shaped
pulses in closed form. A parameterisation does exist, however, to approximate the topology
using trapezoidal pulses confined to strips [5]. If we define this parameterisation as the
operator $\mathcal{P}$ then the phase profile becomes $\hat{\theta}'(\mathbf{u}) = \mathcal{P}[\theta'(\mathbf{u})]$.

It is useful in the final stages of the DPE design to define a parameter to describe how
closely the DPE signal approaches the ideal. Here we use the reconstruction error, $\Delta R$, defined
in Ref. [4]. $\Delta R$ is a measure of the maximum relative deviation of the DPE signal intensity
from the average power. For most applications $\Delta R < 0.1$, although some optical computing
applications are more demanding and require $\Delta R < 0.01$.
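
As a small worked example, the reconstruction error can be evaluated for a set of spot intensities. The exact definition is given in Ref. [4] and is not reproduced in the text, so the function below is only one reading of the sentence above (maximum relative deviation from the mean); the function name and the sample intensities are hypothetical.

```python
def reconstruction_error(intensities):
    """Maximum relative deviation of the spot intensities from their mean."""
    mean = sum(intensities) / len(intensities)
    return max(abs(i - mean) for i in intensities) / mean

# Hypothetical intensities of a five-spot array (arbitrary units).
spots = [0.098, 0.101, 0.100, 0.102, 0.099]
delta_r = reconstruction_error(spots)   # mean 0.100, largest deviation 0.002
```

Here the result is 0.02, which would meet the general requirement quoted above but not the stricter optical computing one.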

Using the closed form power spectrum for a DPE with trapezoidal features, $\Delta R$ can
be minimised with a suitable non-linear optimisation algorithm. The choice of algorithm is
simplified by noting the hologram is "pre-optimised" for diffraction efficiency and the
trapezoidal approximation should be close to the ideal solution in hologram space. $\Delta R$ is
highly sensitive to movement of transition points and an optimisation routine that can take
advantage of a near-optimal solution is desirable. Two approaches that fit the criterion have
been implemented: Brent's method and a greedy search (GS). In both cases the cost function
to be optimised, $M$, is defined in Ref. [4]. Brent's method [6] minimises $\Delta R$ by fitting
changes in $M$ from transition point movement to a parabola. By consecutively finding the
minimum point of the parabola (and thus the corresponding minimising transition point) for
each transition point in the DPE, $\Delta R$ approaches zero. This method will only work for a cost
function that behaves as a parabola around the global minimum; that is the case here. The GS
algorithm is perhaps the oldest of all non-linear optimisation methods and is generally
considered to be too computationally intensive for effective use. However, because the
problem presented is almost optimal, this is not the case here, and the ease of implementation
of the GS method becomes an advantage. The GS algorithm works by moving successively through
the transition points with a steadily decreasing trial change to test if this decreases $M$.
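
The greedy search described above can be sketched as a coordinate-wise descent with a shrinking trial step. This is an illustrative sketch only: the function and parameter names are hypothetical, and a simple quadratic stands in for the cost function M of Ref. [4], which is not reproduced here.

```python
def greedy_search(cost, points, step=0.05, shrink=0.5, min_step=1e-6):
    """Move each transition point by +/- step, keep any move that lowers the cost,
    and reduce the step once no point can be improved."""
    points = list(points)
    best = cost(points)
    while step > min_step:
        improved = False
        for i in range(len(points)):
            for trial in (step, -step):
                candidate = points[:i] + [points[i] + trial] + points[i + 1:]
                value = cost(candidate)
                if value < best:
                    points, best, improved = candidate, value, True
                    break
        if not improved:
            step *= shrink
    return points, best

# Stand-in cost: squared distance from an assumed optimum (not the paper's M).
optimum = [0.2, 0.7]
def toy_cost(p):
    return sum((x - t) ** 2 for x, t in zip(p, optimum))

found, final_cost = greedy_search(toy_cost, [0.25, 0.65])
```

Because the start is already close to the optimum, only a handful of accepted moves are needed before the step shrinks, which mirrors the argument made in the text for a pre-optimised hologram.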

At this point in the optimisation a further design constraint is added to the design
problem. As the DPE is designed within the paraxial domain care must be taken to ensure
that the feature size of the trapezoidal pulses does not become too small. This would both
invalidate the paraxial approximation and increase $\Delta R$ by the introduction of high spatial
frequencies in $\hat{\theta}'(\mathbf{u})$.

4.  Experimental  results 

A number of DPE array illuminators have been optimised for a variety of fanout sizes for
both $Z = 4$ and $Z = 8$; a selection of results is presented in Table 1. For the even fanout
gratings the even-orders missing (EOM) symmetry [7] was used to further simplify
optimisation. An important point to note is the slight decrease in diffraction efficiency below
$\eta_u(Z)$ after the final optimisation step. This is a result of both the parameterisation and the
optimisation step to reduce $\Delta R$. However, this reduction is less than 5% and can be taken into
account in the early stages of the DPE design. In all cases $\Delta R$ is within the limit required by
digital optics. Table 1 also provides a measure of the optimisation time required for the final
step, which is generally greater than the IFTA phase, especially for high fanout, but still
remains low enough to make the design method practical (optimisation took place on a Sun
SparcStation 10/51).

Table 1. Array illuminators designed using the described design concept.

Fanout size   Z   Sample size   η_u(Z)   η(Z)    ΔR / %   Δη(Z)   t (s)
3 × 3         4   32 × 32       0.787    0.774   0.219    0.013   7
              8   32 × 32       0.892    0.881   0.250    0.011   13
5 × 5         4   64 × 64       0.791    0.774   0.400    0.017   136
              8   64 × 64       0.885    0.868   0.500    0.010   197
7 × 7         4   64 × 64       0.787    0.760   0.050    0.027   416
              8   64 × 64       0.878    0.867   0.004    0.011   912
9 × 9         4   64 × 64       0.782    0.761   0.004    0.021   2377
              8   64 × 64       0.879    0.861   0.496    0.018   6352
11 × 11       4   128 × 128     0.801    0.750   0.049    0.051   5399
              8   128 × 128     0.892    0.865   0.665    0.027   7485
4 × 4 (EOM)   4   32 × 32       0.823    0.801   0.589    0.022   142
              8   64 × 64       0.912    0.910   0.589    0.002   142
8 × 8 (EOM)   8   64 × 64       0.873    0.857   0.190    0.020   -


5.  Conclusion 

An implementation has been described of a general design concept which uses a trapezoidal
parameterisation to design two-dimensional DPE array illuminators. The design concept
provides information about the diffraction efficiency upper bound that can be obtained for a
given signal as a function of allowed phase levels. The final DPEs have diffraction
efficiencies which fall within 5% of the upper bound. In the final stages a trapezoidal
parameterisation has been used to maintain the phase topology obtained from optimising the
signal phase. This parameterisation also provides a sufficient number of transition points to
reduce signal non-uniformity to the level demanded by today's DOE applications.

Acknowledgements.  This  joint  research  program  has  been  supported  by  NATO  under  a 
collaborative  research  grant.  The  research  at  Heriot-Watt  was  also  supported  by  SERC  under 
the  Scottish  Collaborative  Research  Initiative  in  Optoelectronics  (SCIOS). 

References 

[1] Miller J M, Taghizadeh M R, Turunen J, Ross N, Noponen E and Vasara A 1993 Appl. Opt. 32 2519-2525

[2] Luepken H and Wyrowski F 1994 Appl. Opt. (submitted)

[3] Wyrowski F 1992 Opt. Comm. 92 119-126

[4] Wyrowski F and Bryngdahl O 1988 J. Opt. Soc. Am. A 5 1058-1065

[5] Vasara A, Taghizadeh M R, Turunen J, Westerholm J, Noponen E, Ichikawa H, Miller J M, Jaakkola T
and Kuisma S 1992 Appl. Opt. 31 3320-3336

[6] Press W H (Ed) 1992 Numerical Recipes in Fortran (Cambridge University Press) 395-399

[7] Morrison R 1992 J. Opt. Soc. Am. A 9 464-471


Inst. Phys. Conf. Ser. No 139: Part II

Paper presented at Opt. Comput. Int. Conf., Edinburgh, 22-25 August 1994
© 1995 IOP Publishing Ltd


Demonstration  and  discussion  of  an  interlaced  fan-out 
interconnect 

Abstract.

2D  fully  interconnected  processing  systems  where  the  input  array  is  dilute  can 
use  a  fan-out  interconnect  where  the  extra  channels  are  accommodated  within 
the  repeat  spacing  of  the  input  array. 


1.  Introduction 

A commonly used 2D fan-out (fan-out A), where the input array is replicated at neighbouring
spatial locations which are located at distances of at least the array size from the original
array, is illustrated in Fig. 1.


Fig. 1. Conventional 2D array fan-out.

A  complementary  fan-out  is  illustrated  in  Fig.  2  (fan-out  B).  It  is  complementary  in  the 
sense  that,  whereas  A  is  useful  when  the  subsequent  fan-in  is  to  an  array  of  larger  size  repeat 
spacing  than  the  input,  B  is  useful  when  the  output  array  is  of  smaller  repeat  spacing.  This 
can  be  useful  in  cascaded  interconnect  systems,  for  example. 

Fig. 2. Interlaced 2D array fan-out.

A further advantage of B is that it allows lower resolution fan-out gratings to be used
than  A.  This  in  turn  eases  the  fabrication  tolerances  of  the  grating.  The  reduced  fan-out  angle 
also  allows  more  compact  systems  to  be  envisaged. 


2.  Demonstration  of  interlaced  fan-out 


Fig. 3. Component layout of interlaced fan-out (polarisers omitted); f_LA is the focal length of
each lenslet array, and f_L is the focal length of the focussing lens.


The layout of the optical system is shown in figure 3. Collimated light of 6 mm diameter
from a HeNe laser is used in this pilot experiment, and the apertures of the beamlets
(of which two are shown in figure 3) are defined by the apertures of the lenslets. In a real
application of this interconnect, the beamlets would be defined by the source/modulator
array and associated optics. The collimated beam is incident on two crossed 8 × 1 Dammann
gratings of 690 µm and 567 µm periods respectively, which were fabricated at the Paul
Scherrer Institute, Zurich (PSIZ). A lenslet array of 25 mm focal length and 1.15 × 1.4 mm
repeat spacing (also fabricated at PSIZ [1]) acts as an array of Fourier Transform lenses to
form multiple fan-outs in the focal plane of the lenslets. In the focal plane, a liquid crystal
television (LCTV) (Seiko-Epson VPJ-2000) is placed. The grating periods and the focal
length of the lenslets are chosen such that an 8 × 8 array of spots is formed by each lenslet on
the 56 µm × 46 µm repeat spacing of the pixel modulators of the LCTV. The repeat spacing
of the lenslets is chosen such that 25 pixel repeat spacings separate each array of 8 × 8
(Figure 4a). A lenslet array of the same specifications follows the LCTV, thus forming a
telescope of unity magnification.
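
The quoted dimensions can be cross-checked with the small-angle grating equation. The sketch below rests on two assumptions that are not stated in the text: the wavelength is the common 632.8 nm HeNe line, and the even-numbered Dammann gratings use only odd orders, so that adjacent spots are two orders apart. The names are hypothetical.

```python
WAVELENGTH_UM = 0.6328        # assumed: red HeNe line (not given in the text)
FOCAL_LENGTH_UM = 25_000.0    # lenslet focal length of 25 mm, from the text

def spot_pitch_um(period_um, order_step=2):
    """Spacing of adjacent used orders in the lenslet focal plane (small angles).

    order_step=2 encodes the assumption that only odd orders carry light.
    """
    return order_step * WAVELENGTH_UM * FOCAL_LENGTH_UM / period_um

pitch_a = spot_pitch_um(690.0)    # about 45.9 um, close to the 46 um pixel pitch
pitch_b = spot_pitch_um(567.0)    # about 55.8 um, close to the 56 um pixel pitch

# Lenslet repeat spacing expressed in pixel pitches along each axis.
lenslet_pitch_in_pixels = (1400.0 / 56.0, 1150.0 / 46.0)
```

Under these assumptions both grating periods reproduce the LCTV pixel pitch to within about half a micrometre, and both lenslet spacings come out at 25 pixel pitches, in agreement with the figure quoted above.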


Fig. 4. Layout of illuminated regions on the LCTV (a), and rear focal plane of lens (b).

A 100 mm achromat fans in the parallel beams from each 8 × 8 array onto a spot on
the camera. Therefore, spot (2,1) in figure 4b has received light from all the pixels (2,1) in
the interconnection matrix in figure 4a. In the present experiment, the distance between the
second lenslet array and the lens is 120 mm. We believe that this distance is noncritical.


3.  Results 

In the rear focal plane of the lens (output plane) there is just a single 8 × 8 array, as would be
formed by the combination of Dammann gratings and 100 mm focal length lens (Fig. 4b).
When the number of columns of open apertures in the LCTV is reduced from 8 to 3 (Fig.
5a), an 8 × 3 array results in the output plane (Fig. 5b). The faint appearance of the 5th
column in figure 5b is due to the limitations of the LCTV address electronics. Although the
5th column is shut off, electronic crosstalk from the 6th column gives the pixels a finite
transmission.


Fig. 5.  Layout  of  8  x  3  apertures  on  the  LCTV  (a),  and  rear  focal  plane  of  lens  (b). 


4.  Discussion 

There  are  a  number  of  uses  for  this  type  of  fan-out.  In  the  present  case,  it  was  developed  in 
the  design  procedure  for  an  optoelectronic  multilayer  Perceptron  [2].  The  intermediate 
processing  plane  (PP)  in  this  Perceptron  has  the  same  physical  dimensions  as  the 
interconnection  plane.  The  two  design  options  are  either  to  demagnify  the  output  of  the  PP 
and  use  a  conventional  fan-out  (Fig.  1)  or  to  maintain  the  size  of  the  output  of  the  PP  and  use 
an  interlaced  fan-out  (Fig.  2).  The  latter  eases  the  fabrication  tolerances  on  the  grating  and 
allows  larger  spectral  bandwidth  illumination  sources,  due  to  the  reduced  fan-out  angle 
requirement.  Moreover,  since  both  the  lenslets  and  lens  work  at  small  field  angles,  no  special 
lens design features are required. We expect improved spot sizes in the output plane in an
optimised system because the fan-in is incoherent across the full aperture of the lens.

A further potential application area arises when the Hadamard product of a matrix and
its transpose is required, e.g. in Delta rule and backward error propagation learning in neural
networks [3]. The matrix which results from a conventional fan-out of the letter L is
represented in figure 6a.

Fig. 6. The Hadamard product of (a) and (b) forms a Delta rule weight update matrix (c).

The matrix which results from an interlaced fan-out is represented in figure 6b. This
matrix (Amj) can be obtained by transposing the high frequency row and column indices of
the matrix in figure 6a (Aiju). Holographic techniques have also been developed for
performing this transposition [4]. The point-by-point or Hadamard product of these two
matrices (Fig. 6c) is the matrix which is required for updating the interconnection matrix in
Delta rule learning. Similar outer products between the neural output matrices and the error
matrices are required for weight correction in backward error propagation learning. Also,
outer products are used to form the weight matrix in high-order networks [5].
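
The relation between the two fan-outs and the outer product can be made concrete with a toy example. The modelling below is an assumption made for illustration: the conventional fan-out is represented as side-by-side copies of the whole pattern, the interlaced fan-out as each element expanded into a block, and the 2 × 2 pattern is a hypothetical stand-in for the letter L. All function names are hypothetical.

```python
def tile(pattern, n):
    """Conventional fan-out (as modelled here): n x n copies of the whole pattern."""
    size = len(pattern)
    return [[pattern[r % size][c % size] for c in range(size * n)]
            for r in range(size * n)]

def interlace(pattern, n):
    """Interlaced fan-out (as modelled here): each element becomes an n x n block."""
    size = len(pattern)
    return [[pattern[r // n][c // n] for c in range(size * n)]
            for r in range(size * n)]

def hadamard(a, b):
    """Element-by-element product of two equally sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

pattern_l = [[1, 0],
             [1, 1]]          # crude 2 x 2 "L"
product = hadamard(tile(pattern_l, 2), interlace(pattern_l, 2))
# product[r][c] equals pattern_l[r // 2][c // 2] * pattern_l[r % 2][c % 2],
# i.e. every pairwise product of two pattern elements: an outer product.
```

In this model the element-wise product of the two fanned-out matrices contains every pairwise product of pattern elements, which is the structure a Delta rule weight update requires.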

The  hardware  for  an  optically  updatable  interconnection  matrix  is  being  developed 
which  uses  Delta  rule  learning  [6].  The  optical  system  around  this  learning  chip  will  have  to 
generate  matrices  corresponding  to  Figs.  6a  and  6b.  Therefore,  the  interlaced  interconnect 
could  be  useful  in  this  context. 

Finally, it should be noted that the latest generation of optical array sources/modulators
are sufficiently dilute to warrant consideration of an interlaced interconnect. For example,
vertical cavity surface emitting lasers have large spacings (up to 250 µm) in 8 × 8 array
format [7]. Smart SLMs have large spacing between the modulators (up to 400 µm) for
electronic processing at each pixel. Moreover, the desire to build compact systems with these
devices demands an interconnection technology which eases the fabrication requirements on
the components. Interlaced fan-out does this for the gratings, but demands high quality
lenslet arrays as a complement. It is hoped that the present emphasis on fabricating lenslet
arrays succeeds in producing high quality components.

Investigations into the use of lenslet arrays in optical
signal processing and computing

Abstract. This paper describes several novel components and Fourier Transform-related
techniques for the generation of very large uniform arrays of spots. One technique also forms
a valuable assessment tool for arrays of components such as lenslets.


1.  Introduction  to  array  generators 

The generation of large arrays of optical spots is of great importance for many optical signal
processing and optical computing applications such as the matrix vector multiplier. Several
different array generation techniques exist, such as lenslet arrays and Dammann gratings.
However, the generation of large arrays (~1000 × 1000) of uniform spots still proves to be
a challenging problem. For example, when lenslets are used in the usual
manner (figure 1a) the output array is sensitive to defects and to the nonuniform intensity
profile of the illuminating beam. In this paper the authors describe several new methods
using either lenslet arrays or a structure referred to as the chirp Dammann grating which
are potentially capable of producing large uniform arrays of spots.


Figure 1 Spot array generation using lenslets in a) the direct mode and b) a self-imaging mode.


1.1 Self-imaging/Talbot effect for defect correction


Lenslet arrays illuminated with laser light exhibit a "self-imaging" effect (figure 1b) related
to that recently reported by Heaton et al in integrated optic waveguides, and to the
classical Talbot effect. The self-imaging effect can be used to produce replicas of the
lenslets' focal plane with different multiplicity, i.e. number of spots and spacing. An
additional advantage over the direct method of use is that local defects can be "healed".
Figure 2 illustrates this for a lenslet array with one lens blacked out; Figure 2a shows the
focal plane with the spot missing and 2b shows a self-image plane with the spot restored.


Figure  2  Intensity  plots  of  a)  the  focal  plane  of  a  lenslet  array  with  missing  focal  spot  and  b)  a  self-image 
plane  with  "self-healed"  spot.  The  "healing"  arises  because  each  output  spot  derives  from  many  input  spots 
as  shown  in  Figure  lb. 

1.2  Chirp  Dammann  gratings 

An alternative approach to the generation of uniform spot arrays is the use of Dammann
gratings. To date the design of such gratings has been non-analytical and confined to
arrays of modest size. Here a systematic approach to the initial design of such gratings
can be demonstrated for the generation of large arrays; the result is a structure called the
chirp Dammann grating, since it resembles a surface acoustic wave (SAW) chirp filter.
This structure is also closely related to a one-sided classical zone-plate and phase-reversal
diffractive lens. Figure 3a shows the calculated Fourier Transform (FT) of such a device,
displaying regions of uniform intensity either side of the absent zero order. Note that the
profile is not perfectly flat as it suffers from the Gibbs phenomenon. Computer simulations
show that this can be dramatically smoothed by apodising the grating. The FT of an array
of such chirp gratings is an array of spots under the envelope; Figure 3b shows a small
section of the FT of a 1 × 6 array of such chirp gratings.

Figure 3 (a) Calculated FT of a single 1D binary phase chirp grating showing two regions of uniform intensity
either side of the absent central order. (b) Detail of the calculated FT of a 1 × 6 array of chirp gratings.

1.3 Lenslets in the Fourier Transform mode

This line of reasoning has brought us full circle back to the use of (e.g. refractive) lenslet
arrays for spot array generation, not, however, used simply, or in a Talbot self-image
plane, but as a replacement for the Dammann grating, i.e. in an FT mode (figure 4). The
use of lenslet arrays in this manner retains the defect self-healing property discussed above,
and is potentially suitable for the generation of very large arrays, say 1000 × 1000, or
greater. This is a consequence of the large space-bandwidth product (SBP) of the individual
lenses. In fact the use of lenslets and the chirp Dammann grating contrasts with other array
generator methods since they operate best for large numbers of spots, and perfectly in the
limit of infinite arrays. However, for finite arrays the use of computers will be necessary to
optimise performance, as in the case of Dammann gratings. A different use of this approach
is as a non-mechanical means of assessing the average quality of the lenslets, since the FT of
an array is dependent on the profile of the individual lenses.

Calculations  show  that  a  spherical  lens  of  square  aperture  provides  a  good 
approximation  to  a  square  array  of  uniformly  intense  output  spots.  This  has  been  verified 
by  measuring  the  FT  of  a  convex  lens  masked  with  a  square  aperture,  Figure  5.  This  shows 
a  quasi-uniform  envelope,  accompanied  by  Gibbs  phenomenon.  As  in  the  case  of  the  chirp 
Dammann  grating,  the  use  of  lenslet  arrays  results  in  an  array  of  spots  under  this  envelope, 
the  number  of  which  is  determined  by  the  SBP  of  the  lenslets  and  not  by  the  total  number 
of  lenses  illuminated.  Similarly  apodisation  results  in  more  uniform  spot  intensities. 
Furthermore,  and  in  contrast  to  most  other  techniques  devised  to  date,  this  lenslet  array 
approach  does  not  suffer  the  problem  of  an  actual  or  incipient  zero-order  (dc)  spot  of 
high/variable  intensity  due  to  manufacturing  imperfections. 


Figure 4 Schematic of large array generation using lenslets in a FT mode.

Figure 5 Measured FT (intensity) of a single lens of square aperture under laser illumination.

2. Conclusions and discussion

It  is  clear  that  the  use  of  the  FT  mode  and  other  associated  modes  has  many  advantages 
over  other  techniques  for  the  generation  of  large  spot  arrays.  For  example  the  self-imaging 
phenomenon  demonstrates  the  ability  to  perform  local  defect  correction,  as  well  as  the 
capability  to  produce  a  multiplication  in  the  number  of  spots.  This  has  obvious  attractions 
when  considering  the  possible  introduction  of  defects  and  dirt  during  lenslet  manufacture 
and  use.  An  assessment  of  the  average  shape  and  quality  of  lenslets  in  an  array  can  be 
performed  using  the  FT  of  the  whole  array.  This  may  provide  a  quick  and  inexpensive 
means  of  assessing  an  array  without  the  need  to  inspect  individual  lenslets. 

The potential capability of generating large arrays (≫100 × 100) of spots of
nominally equal intensity from small numbers of elements, i.e. lenslets or chirp Dammann
gratings, has important implications for the manufacture of array generators. The chirp
Dammann grating approach employing binary phase gratings is inherently prone to the dc
spot problem as a consequence of variability in the manufacturing process. However, this
problem, along with interference from unwanted higher grating orders, can be alleviated by
designs which isolate the first grating orders spatially to form a block of four output arrays.
Such a structure may itself be a beneficial feature of the subsequent SLM, as it provides
better access to the electronic addressing signals, the performance of current SLMs often
being limited by addressing/multiplexing difficulties. Techniques exist to reformat such an
array into one larger fully-filled array if so desired.

Abstract. A Dammann grating for 65 × 65 spot arrays has been optimized using a
simulated annealing-tempering-generalized reduced gradient (SAT-GRG) optimization
algorithm and fabricated. A crossover optical interconnect network of 64 × 64 pixels
has been implemented with the Dammann grating.


1.  Introduction 

A Dammann grating is a binary phase grating that can be used to generate equal
intensity spot arrays. Recently, interest in Dammann gratings has been stimulated
by the development of optical computing. In optical computing systems, array sizes
(N) as high as 64 × 64, or more, are required. Due to the complexity of optimization
in the case of large spot arrays, the application of Dammann gratings will be
limited. In order to optimize Dammann gratings, we present an efficient optimization
algorithm: the simulated annealing-tempering-generalized reduced gradient algorithm
(SAT-GRG). A Dammann grating with 65 × 65 spot arrays is fabricated and used in a
crossover optical interconnect network.

2.  Optimization  and  fabrication  of  the  Dammann  grating 

The principle of Dammann gratings is based on Fraunhofer diffraction theory.
Dammann gratings are illuminated by a plane wave from a laser source. The output
pattern appears in the back focal plane of a Fourier converging lens. The amplitude
of the diffraction pattern is given mathematically by the Fourier transform of the
complex amplitude transmittance of the grating G(x,y), where G(x,y) is a binary
periodic function. Since the output pattern is supposed to be separable in the x and y
directions, the complex amplitude transmittance in the x direction, g(x), can be
given by:


$$g(x) = \sum_{m=-\infty}^{\infty} A_m \exp(2\pi i m x) \qquad (1)$$
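
where $A_m$, the amplitudes of the diffraction orders in the Fourier plane, can be expressed as:

$$A_0 = 4 \sum_{n=1}^{N} (-1)^{n+1} x_n + (-1)^{N} \qquad (2)$$

$$A_m = \frac{2}{m\pi} \sum_{n=1}^{N} (-1)^{n+1} \sin(2\pi m x_n), \quad m \neq 0 \qquad (3)$$

Here $x_n$ (n = 1, ..., N) are the transition points of the grating within one half period, normalized to the period. The diffraction intensity of the m-th order, $I_m$, can be written as:

$$I_m = |A_m|^2 \qquad (4)$$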

For the optimization, we define an objective function:

$$\mathrm{Merit} = \sum_{m=-N}^{N} \left( I_m(x) - \bar{I} \right)^2, \qquad \bar{I} = \frac{\eta}{2N+1}, \quad \eta = \sum_{m=-N}^{N} I_m(x) \qquad (5)$$

where $\eta$ is the diffraction efficiency of the Dammann grating, $I_m(x)$ is the diffraction intensity of the m-th order, and $\bar{I}$ is the mean value of the intensity. In order to obtain equal-intensity light spots, the intensities $I_m$ of all diffraction peaks must be equal:

$$I_0(x) = I_{\pm 1}(x) = \cdots = I_{\pm N}(x) \qquad (6)$$
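
As an illustrative sketch (not the authors' code), the order amplitudes of Eqs. (2)-(3) and the objective function of Eq. (5) can be evaluated numerically for a given set of transition points. The function names are assumptions made for this example, and the formulas assume a symmetric binary (+1/-1) grating with the period normalized to 1 and transition points in (0, 1/2).

```python
import numpy as np

def order_amplitudes(x, n_orders):
    """Amplitudes A_0 .. A_{n_orders} of a symmetric binary (+1/-1) grating.

    x: increasing transition points in (0, 0.5); the period is normalized to 1.
    """
    x = np.asarray(x, dtype=float)
    n = np.arange(1, x.size + 1)
    signs = (-1.0) ** (n + 1)                      # (-1)^(n+1)
    a0 = 4.0 * np.sum(signs * x) + (-1.0) ** x.size
    m = np.arange(1, n_orders + 1)
    am = 2.0 / (m * np.pi) * (np.sin(2.0 * np.pi * np.outer(m, x)) @ signs)
    return np.concatenate(([a0], am))

def merit(x, n_orders):
    """Return (merit, efficiency) over the orders -n_orders .. n_orders."""
    inten = order_amplitudes(x, n_orders) ** 2
    # I_{-m} = I_m for a symmetric grating, so mirror the positive orders.
    all_orders = np.concatenate((inten[:0:-1], inten))
    eta = all_orders.sum()
    mean = eta / (2 * n_orders + 1)
    return float(np.sum((all_orders - mean) ** 2)), float(eta)
```

For a single transition point at x = 0.25 the two phase levels occupy equal areas, so the zeroth order vanishes and the first-order amplitude is 2/π; this provides a quick sanity check of the formulas.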

To account for errors introduced in the fabrication process, the objective function has been corrected. Assuming the fabrication error factor is $\omega$, so that the phase error is $\delta = \pi\omega/2$, the corrected objective function can be written as:

$$\mathrm{Merit} = \sum_{m=1}^{N} \left[ 2 \times \left( \bar{I} - I_m' \right)^2 \right] + \left( \bar{I} - I_0 \right)^2 \qquad (7)$$

where $I_0$ and $I_m'$ are the zeroth-order and higher-order diffraction intensities, respectively.

The simulated annealing (SA) algorithm is one suitable optimization method, but its disadvantage is that it requires very long computing times for large spot arrays. To reduce the calculation time and improve efficiency, we add a tempering process to the SA algorithm; the resulting SAT algorithm avoids spending too much time at local minima of the merit function. Furthermore, on the basis of the SAT algorithm we add a generalized reduced gradient (GRG) algorithm, which gives much higher precision. With the SAT-GRG algorithm, the Dammann grating structure obtained for a 65×65 spot array has a non-uniformity of 0.4% and a diffraction efficiency of 82%, as shown in Fig. 1.
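
The following is a minimal sketch of simulated annealing with a reheating step, written under stated assumptions rather than as the authors' SAT-GRG implementation. The text does not define "tempering" in detail; here it is assumed to mean raising the temperature again when the search stalls. The GRG refinement stage is omitted, and all parameter names and default values are hypothetical.

```python
import numpy as np

def intensities(x, n_orders):
    """Intensities of orders -n_orders..n_orders (period normalized to 1)."""
    x = np.sort(np.asarray(x, dtype=float))
    signs = (-1.0) ** (np.arange(x.size) + 2)      # (-1)^(n+1), n = 1..K
    a0 = 4.0 * np.sum(signs * x) + (-1.0) ** x.size
    m = np.arange(1, n_orders + 1)
    am = 2.0 / (m * np.pi) * (np.sin(2.0 * np.pi * np.outer(m, x)) @ signs)
    i_pos = am ** 2
    return np.concatenate((i_pos[::-1], [a0 ** 2], i_pos))

def merit(x, n_orders):
    """Sum of squared deviations of the order intensities from their mean."""
    inten = intensities(x, n_orders)
    return float(np.sum((inten - inten.mean()) ** 2))

def anneal_with_tempering(n_orders, n_trans, steps=20000, t0=1e-2,
                          cooling=0.999, stall_limit=500, reheat=5.0, seed=0):
    rng = np.random.default_rng(seed)
    x = np.sort(rng.uniform(0.0, 0.5, n_trans))
    cost = merit(x, n_orders)
    best_x, best_cost = x.copy(), cost
    t, stall = t0, 0
    for _ in range(steps):
        # Perturb all transition points slightly and keep them in [0, 0.5].
        trial = np.clip(x + rng.normal(0.0, 0.01, n_trans), 0.0, 0.5)
        trial_cost = merit(trial, n_orders)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if trial_cost < cost or rng.random() < np.exp((cost - trial_cost) / t):
            x, cost = trial, trial_cost
        if cost < best_cost:
            best_x, best_cost, stall = x.copy(), cost, 0
        else:
            stall += 1
        t *= cooling
        if stall >= stall_limit:
            # Assumed "tempering" step: reheat to escape a local minimum.
            t, stall = t * reheat, 0
    return np.sort(best_x), best_cost
```

A gradient-based refinement of `best_x` would follow in a full SAT-GRG scheme; that stage is left out here because its details are not given in the text.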

To fabricate Dammann gratings for 65×65 spot arrays, VLSI techniques are used in our experiment. A Si3N4 film is deposited on a quartz glass substrate to the desired thickness using a PECVD system; the phase pattern of the Dammann grating is then transferred onto the Si3N4 film using a photolithographic mask and reactive ion etching. The phase etching depth is given by:

$$d = \frac{\lambda}{2(n - n_0)} \qquad (8)$$

where n is the refractive index of the Si3N4 film and $n_0$ is the refractive index of air. A photograph of the Dammann grating is shown in Fig. 2.
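
As a small worked example of Eq. (8), the sketch below evaluates the etch depth for a π phase step. The wavelength and film index are not stated in the text; the values used here (a He-Ne wavelength of 632.8 nm and a nominal index of 2.0) are assumptions chosen only for illustration.

```python
def etch_depth(wavelength_nm, n_film, n_ambient=1.0):
    """Etch depth for a pi phase step: d = lambda / (2 (n - n0))."""
    if n_film <= n_ambient:
        raise ValueError("film index must exceed the ambient index")
    return wavelength_nm / (2.0 * (n_film - n_ambient))

# Assumed example values, not taken from the paper.
depth_nm = etch_depth(632.8, 2.0)
```

With these assumed values the depth is half the wavelength divided by the index difference, i.e. a few hundred nanometres, which is why film thickness control during deposition matters.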


Fig. 1. The structure of the 65×65 spot array Dammann grating


Fig. 2. Photograph of the 65×65 spot array


3. Optical implementation of a crossover network with 64×64 pixel arrays

The crossover interconnection network is one of the most important interconnection networks. A crossover network with 2N channels has $n = \log_2 N$ stages of interconnection links. The 2N channels can be classified into two groups: one group of N channels corresponds to a straight connection, and the other N channels accomplish the cross permutation. The crossover optical interconnection network can be implemented in relatively simple, low-loss and economical free-space optical hardware. It can be used in an opto-electronic hybrid parallel computer system to connect large numbers of electronic processors. Such a system combines the high intelligence of electronics with the massive parallelism of optics, which could enhance the speed of current computers.

The optical setup of one stage of the optical crossover network in our experiment is shown in Fig. 3. The beam from the laser is projected onto the Dammann grating (D) through a combination lens (L1). In order to demonstrate the exchange in the output pattern, a mask with the character "E" is placed in the focal plane of the lens (L1), where an input pattern of the character "E" of 64×64 pixels is produced. The beamsplitter (BS) splits the input pattern into two different paths. The path E-P2-CCD, with a mirror in the P2 plane, implements the straight connection, and the path E-P1-CCD implements the cross connection. The output pattern of the i = 0 stage for the cross connection can be achieved by placing a reflecting 90° prism in the P1 plane. To implement the output pattern of the i-th stage cross connection, we can place a prism grating of period Z in the P1 plane. Using six such setups, we can complete the crossover network interconnection function for 64×64 spot arrays.
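
The link pattern of one crossover stage can be sketched in one dimension as follows. This is an illustrative model under the usual definition of the crossover network (stage i divides the channels into 2^i equal blocks and the cross link mirrors each channel about the centre of its block); the function name and the 1D simplification are assumptions, not part of the experiment described above.

```python
def crossover_stage(num_channels, stage):
    """Return (straight, cross) destination lists for one crossover stage.

    Stage 0 mirrors the whole array (single 90-degree prism); stage i mirrors
    within each of 2**i blocks (prism grating with 2**i periods).
    """
    block = num_channels >> stage
    if block < 2 or (block << stage) != num_channels:
        raise ValueError("num_channels must be divisible into 2**stage blocks of size >= 2")
    straight = list(range(num_channels))
    cross = [(k // block) * block + (block - 1 - k % block)
             for k in range(num_channels)]
    return straight, cross

# Cross links of all stages for an 8-channel example.
stages_8 = [crossover_stage(8, i)[1] for i in range(3)]
```

For 64 channels the block size reaches 2 at stage 5, so stages 0 to 5 are available, which matches the six setups mentioned above.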


Fig.  3.  Optical  setup  of  one  stage  of  crossover  network 


4.  Conclusion 

A Dammann grating for generating a 65×65 spot array has been optimized using the SAT-GRG algorithm and fabricated by thin film deposition, photolithography and reactive ion etching. A crossover optical interconnection network for 64×64 spot arrays has been demonstrated with the Dammann grating. This experimental setup shows the advantages of equal optical path length and no light energy loss.

Demonstration of a 3D integrated refractive microsystem

Abstract. A 3D integrated microsystem which uses only refractive components (planar microlenses and microprisms) is demonstrated. The system performs the overlay of two dataplanes. The output plane is 400 µm × 400 µm and contains 8×8 data channels.


1.  Introduction 

3D integrated microsystems promise a higher packaging density and complexity for optical interconnects and functional elements compared to 2D OEICs (e.g. Brenner 1991, McCormick 1993, Ozaktas/Goodman 1994). However, the additional degrees of freedom demand more effort in aligning and mounting the system. Two different approaches can be distinguished. The use of diffractive components has the advantage that different functions (e.g. focussing and aperture division) can be easily combined in one element and only one fabrication technique is needed. On the other hand, diffractive elements suffer from a high sensitivity to wavelength variations and the fabrication technique generally requires facilities with submicron alignment accuracy. Refractive elements are much less sensitive to wavelength variations and can in many cases be produced by very simple means which are standard in an optical workshop, but usually different techniques are used for the production of different elements and these elements have to be combined in order to build a system. So far, mostly diffractive systems have been demonstrated (e.g. Brenner/Sauer 1988, Jahns 1990, Acklin/Jahns 1994). We present a very simple refractive microsystem and demonstrate that the required technologies are available and manageable.

2.  Design  of  the  system 

The function of the system under investigation is to interlace two dataplanes (Fig. 1). Possible applications of this functionality are symbolic substitution algorithms or interconnection networks (e.g. Brenner et al 1986, Miller et al 1992). Depending on the type, number and relative position of the components that are to be used, there are different possibilities for the optical realisation (Fig. 2). Approach a) performs the overlay without prisms but requires prisms in the input plane for high light efficiency and homogeneous illumination. Approach b) uses aperture division. It is a feasible approach in the macrooptic case but is not scalable to microoptic dimensions for two reasons. First, the space-bandwidth product (SBWP) scales with the square of the lens diameter, resulting in typical values of order 10^ in the microoptic domain. It is therefore unfavourable to reduce the SBWP further through aperture division. The second problem is that the prism edge may be rounded by the fabrication process. If the edge runs across the center of the lens, it will reduce the image quality through stray light. Approach c) avoids both problems by shifting the prism edge to lie between two lenses. This also relaxes the lateral alignment tolerance between lenses and prisms, because the only critical design parameter is the prism angle. Approaches c) and a) are equivalent in terms of SBWP and number of optical elements but not in terms of manufacturing simplicity. Using approach c), the prisms and lenses can be aligned and fixed, resulting in one single element, whereas in case a) the prisms must be aligned with respect to the lenses and to the input plane. In the subsequent experiment, approach c) was chosen.

Figure 1. Overlay of two dataplanes

Figure 2. Three possibilities for the optical realisation: a), b), c)


3.  Fabrication 

3.1. Lenses

The lenses are planar gradient-index microlenses. These lenses were fabricated by Na-Ag ion exchange in glass, which has been reported elsewhere (e.g. Brenner et al 1992). This process allows the fabrication of lenses with f# = 6 and less than a quarter wavelength of spherical aberration. In the experiment, lenses with diameter D = 200 µm, focal length f = 1100 µm (in air) and lens pitch p = 100 µm are used.


3.2. Prisms

The prisms were made by thermal moulding and casting (Fig. 3). First, a glass master of the prism is made with standard techniques. The glass master is then heated, embossed in a PMMA substrate and cooled down (a). Repeating the process generates a regular array of identical negative prisms (b). The negative array is now used as a master for the production of the final prism array. For this purpose, it is filled with UV-curing glue and covered with the planar microlens array (c). Both arrays are aligned under a microscope and exposed to UV light. After curing of the glue, the negative is separated from the system (d). Figure 3e) shows an interferometric measurement of a section of a negative prism.


Figure  3.  Fabrication  of  the  prisms  by  thermal  moulding  and  casting 


4. Experimental setup

A chromium mask which contains circular and rectangular holes is used as the input of the system. The mask is illuminated by two LEDs. Each LED is demagnified by a microscope objective and imaged into the plane of the microlenses. In this way both channels can be switched independently. The output plane is magnified by a second microscope objective and imaged onto a CCD camera. The camera is connected to a frame grabber to allow quantitative measurements. The LEDs can be replaced by a halogen lamp and colour filters.


Figure 4. Experimental setup: a) overall setup (LEDs, microscope objectives, CCD camera), b) microsystem.


5.  Experimental  results 

Fig. 5 shows camera images of both channels switched independently and together, which demonstrates the required functionality. Table 1 gives the related data. For the S/N ratio, a detector area (40 µm × 40 µm) and a detector separation (10 µm) equal to those of the mask were assumed.

Table 1. Experimental data

size of output plane                      400 µm × 400 µm
number of data channels                   8 × 8
size of rectangles/circles                40 µm × 40 µm / 40 µm diam.
separation of data channels               10 µm
S/N ratio                                 17:1 (450 nm), 11:1 (633 nm)
light efficiency (excl. Fresnel losses)   89%
peak/channel homogeneity                  7% / 4%
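
The geometric entries of Table 1 are mutually consistent, as the short check below illustrates. It assumes that the channel pitch equals the feature size plus the channel separation; the function name is hypothetical.

```python
def output_plane_side_um(channels_per_side, feature_um, separation_um):
    """Side length of the output plane for a regular grid of data channels."""
    pitch_um = feature_um + separation_um
    return channels_per_side * pitch_um

side_um = output_plane_side_um(8, 40.0, 10.0)
```

With 8 channels per side, 40 µm features and 10 µm separation, the pitch is 50 µm and the side length is 400 µm, in agreement with the stated size of the output plane.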


Figure 5. Output of the microsystem. Enlarged CCD image of the overlaid data planes and CCD images of independently switched channels together with related linescans.

Abstract. Architectural concepts for the planar microintegration of digital optical data processing systems are presented. From these concepts a microoptical system consisting of stacked layers of refractive microcomponents is derived. The optical system is folded using off-axis imaging. The feasibility of this configuration is analysed theoretically and demonstrated by experimental results.


1.  Introduction 

Microintegrated systems for digital optical information processing on 2D data planes impose a series of requirements on the optical realization. High light efficiency is needed to achieve low bit-error rates with the available optical power. For constructing complex systems the design has to be modular. It should be possible to cascade an arbitrary number of processing stages. Only a few technologies should be necessary for the production of a microintegrated system.


2.  Fourier  stages 

Optical Fourier transform stages are necessary in free-space optical systems to perform space-invariant operations. Single-lens 2f Fourier stages (fig. 1) exhibit certain disadvantages when used in microoptical setups consisting of arrays of identical components. A lens diameter of two times the pupil diameter is necessary to avoid vignetting. In an integrated system, the 2f configuration requires an additional spacer in order to place the lens in the center. From the fabrication point of view it would be more desirable to place the lenses at the substrate's surface. A more general two-lens configuration is shown in fig. 2, which is identical to the 2f system when h becomes zero. This setup is chosen to be symmetric, because the space-bandwidth product then reaches a maximum, as was shown in [1]. Assuming symmetric configurations, the number of resolvable pixels is maximal when b = f and a = 0 [4]. This represents the 'light pipe' configuration. In practical implementations, a small distance a is often required for technological reasons. With microoptical systems, the space-bandwidth product is a critical parameter. The distance b should therefore be chosen close to f to obtain maximal resolution.


Figure 1. 2f system


Figure  2.  Two-lens  Fourier  setup 


3.  Pupil  division 

Digital optical systems often require the generation of multiple copies of a dataplane and the superposition of different dataplanes. For these split and join operations pupil division is most advantageous, since the join operation can be performed almost losslessly for an arbitrary number of joins. Pupil division may be performed either by splitting the aperture of a single lens or by using a lens for each channel in the pupil plane. In microoptical setups, multi-lens pupil division is preferable, because the full space-bandwidth product of the light pipe is available in all channels. Additionally, multi-lens pupil division can also be used for space-invariant nearest-neighbor interconnections.


4.  Layer  architecture 

A transfer of these considerations to a microintegrated system is shown in fig. 3. This layer architecture [2] consists of two modified light pipes with multi-lens pupil division. Precise alignment can be achieved if each layer is produced with lithographic precision. The light pipes can be realized on one substrate with microlens arrays on both sides. Mask layers are introduced on additional spacers. The prism layers, realized by embossing, perform space-invariant split and join operations. For thermal and practical reasons, active devices should be located at the system surfaces. In most cases active devices on one surface are sufficient. Cascading of several stages is then realized by placing a mirror layer on the top system surface.

Figure 3. Layer system with lenses, prisms and masks

4.1. Folded systems

The most compact setup is achieved when the optical system is folded. The lenses have to be used in an off-axis way in this case [3]. Only a single layer of microlenses, a layer of active components and a layer of reflective components for space-invariant (shift, split and join) operations in the Fourier domain are required. Structured mirrors in the plane of the active components may be used for space-variant operations. These mirrors can be produced together with the electrical connectors for the active components on top of the microlens array. The active components may be precisely aligned to these structures via flip-chip bonding. The tilted mirrors in the Fourier planes are produced by embossing in polymer materials and thin film coating.


Figure  4.  Microsystem  with  a  single  lens  layer 


4.2.  Off-axis  imaging 

Multi-lens pupil division and folded optical systems require off-axis imaging. The resulting aberrations depend strongly on the type and shape of the lenses used and on the location of the field stops in the Fourier planes. To obtain the maximum space-bandwidth product, the stops should be placed close to the lenses (Section 1). The off-axis aberrations coma and astigmatism can be avoided by placing the stop in the center of a spherical refractive index distribution. For aspheric distributions or flat diffractive lenses, full correction is achievable for a single field position with astigmatic lenses. Curvature of field mainly causes a rotation of the plane of best focus around the center of the image plane by 2θ if the light's incident angle is θ. The field size is limited by the resulting defocus: the defocus inside the field must be smaller than the focal depth of the imaging system. Distortion is very critical in digital optical systems. It does not occur in these symmetrical systems when astigmatism and coma are corrected.

Figure 5. Off-axis imaging with a spherical microlens


5.  Experimental  verification 

The off-axis imaging setup is shown in fig. 5. It consists of an ion-exchanged microlens with a spherical refractive index distribution (f = 2.4 mm in glass), an aperture stop (D = 200 µm) and a mirror. A mask is projected into the front focal plane of the microlens. This object is imaged by the microlens back into the same plane. The image, observed with a microscope, is shown in fig. 5. The image, with dimensions of 100 µm × 300 µm, is free of coma and astigmatism, as expected from the theoretical analysis.

6.  Conclusion 

From the architectural requirements for digital optical processing and the technical data of existing microcomponents, a layer architecture was derived. A single layer of microlenses, with active components and structured mirrors on one side and tilted mirrors on the other side, is sufficient to perform all space-variant and space-invariant operations required for optical processing systems. The considerations on aberration correction were confirmed by an imaging experiment.

Abstract. Phase-matched Fresnel elements have been fabricated for laser array to fibre-cable coupling, fan-out and beam shaping operations. The continuous-relief planar microoptical elements are produced by direct laser writing in photoresist and copied by replication techniques.


1.  Introduction 

Parallel optical links are attracting increasing interest in the fields of data communications and optical computing. Typical applications include lens arrays for light source to fibre array coupling and fan-out elements for parallel, high-speed, high-capacity interconnects for optical processors and data links. This paper describes progress in the realisation of Phase-Matched Fresnel Elements (PMFEs) for such applications. These computer-generated Diffractive Optical Elements (DOEs) are designed with segment profiles and phase steps optimised to maintain proper phase relationships at the design wavelength [1]. By suitable choice of phase step and segment dimensions, they can be designed to combine the advantages of geometrical and diffractive optical components, and their optical characteristics can be considered as resulting from a combination of refractive and diffractive behaviour.

PMFEs  are  fabricated  as  surface-relief  microstructures  by  direct  laser  writing  in  photoresist 
and  can  be  reproduced  by  replication  techniques.  They  can  be  produced  with  very  high 
numerical  aperture,  arbitrary  aperture  shape  and  as  large  area,  close-packed  arrays. 
Examples  of  laser  diode  to  fibre  imaging  lens  arrays  with  high  numerical  aperture  and  of 
focusing N×1 fan-out elements are described.


2.  Fabrication  by  direct  laser  writing 

The fabrication of phase-matched Fresnel elements with continuous-relief surface profiles represents a challenging area of modern optical fabrication technology, in particular for high-aperture lenslets with segment sizes of the order of micrometers. Laser writing (Fig. 1) offers a highly flexible approach for the fabrication of such DOEs.

Fig. 1. Continuous-relief phase-matched Fresnel elements are fabricated by direct writing in photoresist. Segment profiles and phase steps are optimised to maintain proper phase relationships at the design wavelength.


Technology for direct laser writing in photoresist has been developed over a number of years at the Paul Scherrer Institute in Zurich. The system, described in more detail elsewhere [2], uses a HeCd laser to expose a photoresist-coated substrate which is raster scanned under the focused beam using a high-precision xy-stage. The surface-relief microstructure is defined in a data file which can be generated or computed by a variety of standard or custom design procedures. This data is used to control the exposure: the laser beam intensity is synchronously controlled (256-level gray scale) to write a fully programmable, 2-dimensional exposure pattern. Development of the resist then results in a 3-dimensional continuous-relief microstructure with the desired surface profile. Continuous-relief PMFE microstructures, typically with a minimum feature size of 5 µm, up to 5 µm depth and up to about 50 × 50 mm² in size, are routinely fabricated with this system.

PMFE microstructures can be characterised by the maximum relief depth and the minimum segment size. In laser writing technology, the maximum depth is given by the resist layer thickness, typically ~5 µm. The minimum segment size at the perimeter of the lenslet area is determined by the numerical aperture (NA) of the lenslet, together with the phase (height) step at the segment boundary. The minimum segment size can be maximised by choosing the largest phase step possible within the limits of the resist thickness: a 0.5 NA lens with an 8π phase step requires a resist thickness of at least 4.4 µm. On the other hand, the performance of structures with high phase steps is more sensitive to errors in the relief profile. This leads to microstructures of the type shown in Fig. 1, in which the phase step is increased by 2π each time the segment size approaches the limiting value as the radius increases. Lenslets with high NA (≳0.5) have a dominantly diffractive behaviour and function best with monochromatic (laser) or narrow-band (LED) illumination.
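
The trade-off between phase step, segment size and relief depth can be illustrated with a short calculation. This is a sketch under simplifying assumptions that are not taken from the paper: the outermost segment width is estimated from the grating equation as (phase step in waves) × λ / NA, the relief depth from the thin-element relation (phase step in waves) × λ / (n − 1), and the wavelength and resist index in the example are hypothetical values.

```python
def pmfe_outer_segment(wavelength_um, na, phase_step_waves, n_resist):
    """Estimate (segment width, relief depth) in micrometres at the lens rim.

    phase_step_waves is the phase step divided by 2*pi (an 8*pi step is 4).
    """
    if not 0.0 < na <= 1.0:
        raise ValueError("NA must lie in (0, 1] for an element in air")
    width_um = phase_step_waves * wavelength_um / na
    depth_um = phase_step_waves * wavelength_um / (n_resist - 1.0)
    return width_um, depth_um

# Assumed example: 633 nm light, resist index 1.6, NA 0.5.
for waves in (1, 2, 4):
    width, depth = pmfe_outer_segment(0.633, 0.5, waves, 1.6)
    print(f"{2 * waves}pi step: width {width:.2f} um, depth {depth:.2f} um")
```

Under these assumptions a 2π step gives rim segments of roughly one micrometre, well below a 5 µm minimum feature size, while larger steps widen the segments at the cost of a proportionally deeper relief.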

Resist  surface-relief  microstructures  are  electroformed  to  a  Ni  shim  and  reproduced  by 
replication  in  plastic  or  epoxy  materials.  Replication  technologies  which  are  commercially 
available  and  currently  under  investigation  include  hot  embossing,  injection  moulding  and 
uv-replication  techniques. 

Fig. 2. PMFE array for the coupling of a laser diode array into a fibre ribbon cable.


3.  Fresnel  lens  arrays  for  laser  to  fibre  coupling 

High numerical aperture, planar microstructures and flexible design in shape and array formats are attractive features for the application of PMFEs in the coupling of laser diode arrays to fibre ribbon cable. An example (see Fig. 2) is a PMFE array which has been designed for coupling an array of laser diodes (λ = 831 nm) into a ribbon of 12 fibres. The laser diodes each emit an astigmatic beam with divergence angles θ⊥ = 8° and θ∥ = 28° (FWHM) in the horizontal and vertical planes.

The MT-connector compatible ribbon cable consisted of 12 multimode fibres with 0.21 NA and 50 µm core diameter. The PMFE fabricated achieved a measured efficiency of 60% with an object-side NA of 0.5 [3].

Typical laser diodes with an elliptical near field and longitudinal astigmatism produce an elliptically shaped image at the fibre plane. Anamorphic PMFE arrays have also been designed and fabricated for beam shaping and circularising the image to obtain a better match to the circular fibre geometry. Fig. 3 shows the measured irradiance distribution at the fibre plane for such an anamorphic PMFE. The 1/e² radius of the circularised image is about 16 µm.

Fig. 3. Measured irradiance of a circularised image at the fibre plane.

4.  Spatially  interlaced  lenses  for  fan-out  applications 

Novel  PMFE  structures  have  been  designed  to  perform  focusing  fan-out  functions  (see 
Fig.  4.)  in  a  single  planar  microoptical  element  [4].  The  desired  fan-out  function  is 
implemented  by  combining  different  PMFEs  (e.g.  one  for  each  interconnection  channel)  in  a 
special  area  sharing  arrangement.  Each  PMFE  is  divided  into  a  subarray  structure,  leading  to 
an  array  of  focused  diffraction  orders  centred  around  the  focal  point  of  the  basic  PMFE.  The 
period  of  the  different  subarray  structures  is  chosen  such  that  the  diffraction  orders  coincide 
with  and  are  coherently  superimposed  upon  the  desired  image  points.  Simple  and  fast 
procedures  have  been  developed  for  optimising  such  fan-out  PMFEs,  based  upon  a  low 
number  of  well-defined  physical-optical  parameters.  The  concept  can  easily  be  extended  for 
realising  2D  fan-out  elements. 

Fig. 4. (a) Focusing fan-out function using interlaced PMFEs and measured irradiance. (b) AFM image of an interlaced PMFE microrelief (70 µm × 70 µm).

A major advantage of the PMFE fan-out approach is the large tolerance with respect to fabrication errors. In a series of experiments in which depth scaling errors over a range of ±20% were introduced, PMFE fan-outs showed a uniformity error of < 3%. This is much superior to the performance of conventional surface-relief fan-out elements, for which a uniformity reduction of ~10% typically results from a depth scaling error of only a few percent. PMFEs with a 5×1 fan-out have been fabricated and evaluated. Figure 4b shows an Atomic Force Microscope (AFM) image of the surface of such a fan-out PMFE (~1 µm relief amplitude) with the subarray structure clearly visible. The measured uniformity error was about 2%.


5.  Conclusions 

The direct laser writing of planar Fresnel elements has been developed for the flexible and reproducible fabrication of microoptical elements. Novel optical imaging properties are obtained using Phase-Matched Fresnel Elements (PMFEs) in which the segment profiles and phase steps are optimised for the given wavelength and imaging configuration. Examples have been given of PMFE arrays for the coupling of diode laser arrays to fibre ribbons, including anamorphic elements for circularising the image irradiance, and of PMFEs implementing fan-out functions by spatially interlacing different elements. Many other novel imaging and beam shaping functions can be realised.

Abstract. Several schemes that decrease the calculation time of a computer generated hologram through utilising sub-holograms are presented. The applicability of these schemes to differing encoding schemes and design criteria is discussed.


1. Introduction

The performance of a computer generated hologram (CGH) is restricted by its limited information capacity. This limit arises either from the difficulty in calculating the CGH or from the resolution of the fabrication device used to realise it. For example, calculating the hologram using the directed binary search (DBS) algorithm is computationally very intensive due to the number of points which have to be tested during the search.

For  CGH  too  large  to  be  conveniently  directly  determined  an  alternative  design 
strategy  is  to  interlace  several  small  CGH  together.  This  poster  presents  a  new 
algorithm  involving  the  partitioning  of  the  CGH  into  sub-holograms  and  then 
calculating  these  sub-holograms. 


2.  Interlacing  methodologies 

Two patching methodologies have appeared in the literature. The first approach is to
repeat the basic hologram pattern periodically, which turns the initially calculated
structure into a sub-hologram and increases the space-bandwidth product (SBWP) of the
final hologram without adding any extra sampling data to it. This technique is trivial in
that it requires no further calculation, and it results in a punctuated replay with reduced
speckle noise.
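
The punctuated replay of a periodically repeated hologram can be checked numerically. The following is a minimal sketch, assuming a Fourier hologram whose replay is modelled by a discrete Fourier transform and using a hypothetical random 16 × 16 binary pattern that is not taken from the paper: tiling the sub-hologram k × k times leaves light only on every k-th replay sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 16 x 16 binary-phase sub-hologram (values +1 / -1).
sub = rng.choice([-1.0, 1.0], size=(16, 16))

# Repeat it k x k times: a larger hologram, but no new design data.
k = 2
tiled = np.tile(sub, (k, k))

# Replay intensities of the single tile and of the tiled hologram.
replay_sub = np.abs(np.fft.fft2(sub)) ** 2
replay_tiled = np.abs(np.fft.fft2(tiled)) ** 2

# Samples of the tiled replay that do not fall on the k-spaced grid.
on_grid = np.zeros(replay_tiled.shape, dtype=bool)
on_grid[::k, ::k] = True
off_grid_energy = replay_tiled[~on_grid].sum()

# The k-spaced samples are the single-tile replay scaled by k**4.
on_grid_matches = np.allclose(replay_tiled[::k, ::k], k**4 * replay_sub)

print(off_grid_energy, on_grid_matches)
```

In this model the off-grid energy is zero up to rounding error, so the replay is sampled ("punctuated") rather than continuous; the sketch does not attempt to model the speckle behaviour.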

The second technique is the Iterative Interlaced Technique (IIT) of Ersoy et al.,
which iteratively determines the sub-holograms prior to combination. By manipulating
the encoding errors of the sub-holograms the algorithm improves the fidelity of the
large hologram's replay. The IIT is applicable to all CGH encoding schemes and offers
higher-fidelity replays and shorter calculation times than realising a single large hologram.

Using an interlacing algorithm allows holograms to be realised faster than calculating
a single large hologram. The approach is faster because there are fewer possible pixel
permutations to compare in each sub-hologram when evaluating its structure and there are
fewer points to Fourier transform during the calculation process.

The general applicability of the IIT methodology is limited by restrictions
placed upon it by the calculation. For example, since it uses a complex-amplitude-based
cost function (to allow constructive interference between the sub-holograms at the
replay plane) no phase freedom is available. Furthermore, there is also a reduction in
the available amplitude freedom, to avoid the overlapping of shifted images. The
patched CGH will have a low efficiency because the resultant replay is the average of
all the sub-holograms' replays.


3.  Partitioning  methodologies 

If the effects of all the sub-holograms were considered simultaneously, an
intensity-based cost function could be employed, with more amplitude freedom available to
design the CGH. This is the basis for the partitioning algorithm, here demonstrated
with a modified DBS algorithm. We refer to the partitioning version of DBS as
PDBS.

The  accelerated  convergence  of  PDBS  in  comparison  to  DBS  is  due  to  the 
minimisation  of  pixel  interaction  in  the  encoding  process  as  explained  in  the  discussion 
section.  The  FFT  carried  out  in  this  design  process  is  for  the  full  CGH. 

3.1.  PDBS  described 

PDBS starts by determining an initial structure for the entire hologram. The hologram
is then partitioned; for the demonstration calculations a 32 × 32 pixel CGH is divided
into four 16 × 16 pixel sub-holograms. The structure of each sub-hologram is then
tested using a normal DBS methodology. Several partitioning approaches have been
tested, of which two are related here. They differ in that in one each sub-hologram
is fully determined before the next sub-hologram is optimised (PDBS),
whilst in the other a single DBS iteration is conducted for each sub-hologram in turn
(PDBS2).
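
The two partitioning variants can be sketched as follows. This is an illustrative reading of the description above, not the authors' code: it assumes a binary-phase (+1/-1) Fourier hologram, a mean-squared intensity error as the replay error, a raster scan order, and a small 8 × 8 hologram split into 4 × 4 sub-holograms so that it runs quickly. All function names are hypothetical.

```python
import numpy as np

def replay_error(h, target):
    """Mean squared error between the normalised replay intensity and the target."""
    intensity = np.abs(np.fft.fft2(h)) ** 2   # FFT of the full CGH
    intensity /= intensity.sum()
    return np.mean((intensity - target) ** 2)

def dbs_pass(h, target, pixels):
    """One DBS iteration over `pixels`: keep every flip that lowers the error."""
    err = replay_error(h, target)
    flips = 0
    for r, c in pixels:
        h[r, c] = -h[r, c]                    # trial flip
        trial = replay_error(h, target)
        if trial < err:
            err, flips = trial, flips + 1     # accept the flip
        else:
            h[r, c] = -h[r, c]                # reject: restore the pixel
    return flips

def partitions(n, block):
    """Pixel lists of the block x block sub-holograms of an n x n CGH."""
    return [[(r, c) for r in range(r0, r0 + block) for c in range(c0, c0 + block)]
            for r0 in range(0, n, block) for c0 in range(0, n, block)]

def dbs(h, target):
    """Normal DBS: iterate over all pixels until no flip is accepted."""
    pixels = [(r, c) for r in range(h.shape[0]) for c in range(h.shape[1])]
    while dbs_pass(h, target, pixels):
        pass
    return h

def pdbs(h, target, block):
    """PDBS: fully determine each sub-hologram before moving to the next."""
    for pixels in partitions(h.shape[0], block):
        while dbs_pass(h, target, pixels):
            pass
    return h

def pdbs2(h, target, block):
    """PDBS2: one DBS iteration per sub-hologram in turn, until nothing flips."""
    parts = partitions(h.shape[0], block)
    while sum(dbs_pass(h, target, p) for p in parts):
        pass
    return h

rng = np.random.default_rng(1)
n = 8
target = np.zeros((n, n))
target[1, 2] = target[-1, -2] = 0.5           # symmetric two-spot intensity target
start = rng.choice([-1.0, 1.0], size=(n, n))  # random pixel configuration start

e_start = replay_error(start, target)
e_dbs = replay_error(dbs(start.copy(), target), target)
e_pdbs = replay_error(pdbs(start.copy(), target, 4), target)
e_pdbs2 = replay_error(pdbs2(start.copy(), target, 4), target)
print(e_start, e_dbs, e_pdbs, e_pdbs2)
```

Every accepted flip strictly lowers the error, so each loop terminates. The sketch makes no claim about which variant ends with the lower error for a given target; that depends on the start and on the partitioning.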

Figure 1. Outline for PDBS

3.2  Experiments

To compare the performance of PDBS with DBS, several CGHs were made to replay an
intensity-based target. This was undertaken using random pixel configuration starts
and also optimum starts as found using Wyrowski's IFTA algorithm. The
performance of the hologram is measured in terms of the replay error.

3.3  Results 

Table 1. Calculation times and resultant replay errors for CGHs designed using different
versions of the DBS algorithm.

Algorithm   Start     Run time /s   Replay error
DBS         Random    5418          0.000739
PDBS        Random    2555          0.003200
PDBS2       Random    4522          0.000807
DBS         Optimum   5332          0.000467
PDBS        Optimum   4008          0.001250
PDBS2       Optimum   5010          0.000573
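
The trade-off in Table 1 is easier to read as ratios relative to normal DBS. The short sketch below only re-uses the tabulated values; the helper name `relative_to_dbs` is introduced here for illustration.

```python
# Values copied from Table 1: (algorithm, start) -> (run time in s, replay error).
table = {
    ("DBS", "Random"): (5418, 0.000739),
    ("PDBS", "Random"): (2555, 0.003200),
    ("PDBS2", "Random"): (4522, 0.000807),
    ("DBS", "Optimum"): (5332, 0.000467),
    ("PDBS", "Optimum"): (4008, 0.001250),
    ("PDBS2", "Optimum"): (5010, 0.000573),
}

def relative_to_dbs(algorithm, start):
    """Return (speed-up, error ratio) of `algorithm` against DBS for the same start."""
    t_ref, e_ref = table[("DBS", start)]
    t, e = table[(algorithm, start)]
    return t_ref / t, e / e_ref

for start in ("Random", "Optimum"):
    for algorithm in ("PDBS", "PDBS2"):
        speedup, error_ratio = relative_to_dbs(algorithm, start)
        print(f"{algorithm:6s} {start:8s} speed-up {speedup:.2f}x, "
              f"error ratio {error_ratio:.2f}x")
```

For the random start, for example, PDBS is about 2.1 times faster than DBS at about 4.3 times the replay error.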

Figure 2. Computer simulations of the replays of some of the calculated CGHs. Panels:
replay from a PDBS CGH with four sub-holograms; replay for a 32 × 32 DBS CGH from a
good start.


4  Discussion 

An explanation of why the partition searches terminate more quickly than a normal
DBS for the same number of pixels is as follows. DBS works to reduce the encoding
errors introduced into the replay through using a binary hologram by examining the
interaction between pixels. The smaller the region of pixel interaction considered, the
larger the replay error of the final hologram. Thus limiting the amount of pixel
interaction limits the ability of DBS to find a good solution, as there are fewer possible
pixel configurations to consider. That is, the PDBS algorithms have less solution space
to search, and a reduced solution space takes less time to search, giving the modified
DBS algorithm the appearance of an accelerated convergence.

For  all  of  these  partitioning  methodologies  the  resulting  CGH  is  also  a  solution  for 
normal  DBS  (i.e.  a  normal  DBS  starting  from  a  pixel  configuration  determined  by  a 
partitioning  algorithm  will  not  flip  any  more  pixels).  The  PDBS  results  appear  to  be 
poor  local  minima  because  the  PDBS  fidelities  are  much  lower  in  comparison  to  the 
fidelities  found  from  DBS  for  holograms  of  the  same  size. 

Whilst the partitioning methodology is applicable to all the major encoding
schemes (i.e. DBS, error diffusion and the IFTA), it is most appropriate for DBS,
which has the longest calculation time. A possible application for PDBS is in an
optical learning system where an SLM is used as the CGH and the Fourier transform is
undertaken optically. In this regime the CGH is generally designed using DBS;
applying PDBS would allow quicker realisations of the CGH.


5  Summary 

The partitioning approach is a method for accelerating the design of CGHs made
by the DBS encoding scheme. This acceleration is achieved by the use of
sub-holograms to minimise the pixel interaction; the trade-off for reducing calculation time
is an increase in replay error. It is left to the user to decide upon the balance between
calculation speed and replay error. Unlike earlier patching schemes, the partitioning
approach allows the sub-holograms to be calculated from an intensity cost function.


Abstract. A method to design diffractive elements to generate 3D light
distributions is presented. Low resolution binary amplitude modulated diffractive
elements are designed to generate “nondiffracting beams” and arrays of
them in on-axis configuration.


Fourier  and  Fresnel  computer  generated  holography  are  usually  applied  to  generate 
a  desired  field  intensity  distribution  over  a  specified  transversal  plane  and  sometimes  on 
a  limited  depth  of  field. 

In a recent letter [1] a method to control the wavefront propagation over long
distances in the Fresnel and Fourier regions was presented. Low resolution diffractive
elements including phase and amplitude were designed for generation of the so-called
“nondiffracting beams”. In this paper we apply that method to generate
three-dimensional distributions of light with a binary amplitude diffractive element in on-axis
configurations. Preliminary experimental results are also presented.

Applications  of  the  technique  include  display  of  information,  spatial  distribution  of 
energy,  optoelectronic  interconnections,  precision  measurements  and  alignment. 


1.  Procedure 

The problem under consideration is: ‘Given a known wavefront (WF) incident on a thin
low-resolution, amplitude modulated diffractive element, design this element to obtain a
desired intensity distribution within a given three-dimensional domain.’ Fundamental
physical  limitations  do  not  allow  a  solution  for  any  arbitrary  distribution.  However,  if 
a  solution  does  not  exist,  it  still  may  be  valuable  to  derive  the  closest  solution  to  the 
constraints. 

The  general  procedure  is  as  follows  (see  Fig.  1) 

1.  Define a region D behind the diffractive element and the desired intensity
distribution $I(\xi, \eta, z)$ with appropriate tolerances.

2.  Transform $I(\xi, \eta, z)$ into constraints on sufficiently close transversal planes within
this  region. 

3.  Impose  constraints  on  the  diffractive  element  (space-bandwidth,  amplitude  only 
modulation). 

4.  Solve  the  optimization  problem:  ‘Find  an  input  distribution  which  satisfies  the 
constraints in (2) and (3) or a reasonable approximation according to a certain measure’.

In  reference  [1]  we  calculate  the  maximal  distance  between  consecutive  transversal 
planes  that  assures  that  longitudinally  continuous  light  distributions  are  achieved. 


Figure 1. The basic scheme for generation of 3D light distributions. (Labels in the
figure: arbitrary wavefront, diffractive element, distance d1.)


The  algorithm  of  projections  onto  convex  sets  (POCS)  adapts  to  the  described 
method,  leading  to  good  designs  after  a  few  iterations. 

The complex amplitude over a plane at a distance z can be described by the
Fresnel transform FrT. The constraints prescribed on this plane can be transformed into
constraints on the field at $z = 0^+$ and represented as $C_z$. There is a projection operator
$\mathcal{P}_z$ associated with each constraint $C_z$.

Given a function $f$ describing the input field, we find its projection onto the
constraint by calculating the field at the distance z, projecting this function onto the
constraints of that plane and finally performing the inverse FrT to obtain the
corresponding field at the input:

$$ g = \mathcal{P}_z f = \mathrm{FrT}^{-1}\{ p_z\, \mathrm{FrT}(f) \} \qquad (1) $$

where $p_z$ represents the projection operator on the plane at the distance z.

If we take n planes we have a set of constraints $C_0, C_1, \ldots, C_n$, where $C_0$ is the
constraint at $z = 0$. In the case of amplitude modulated elements, this constraint can
be expressed as follows:

$$ \mathcal{P}_0[f(x,y)] = \begin{cases} \mathrm{Re}[f] & \text{if } \mathrm{Re}[f] > 0 \\ 0 & \text{if } \mathrm{Re}[f] \le 0 \end{cases} \qquad (2) $$

In the problem under consideration, a function $f$ is sought which lies
simultaneously in all sets $C_i$ ($i = 0, \ldots, n$):

$$ f \in C = \bigcap_{i=0}^{n} C_i \qquad (3) $$


Figure 2. a) Binary amplitude diffractive element (128 × 128 cells, 100 ×
100 µm) for generating a “nondiffracting beam”. b) Experimental result of the
intensity distribution on a transversal plane at z = 53 cm. The peak remains
parallel to the optical axis at a distance 1.2 mm, along 10 cm.


Figure 3. a) Cross section of the intensity peak at z = 50 cm. b) Simulation
result at the same distance. The peak width is similar to the diffraction limit.
c) Simulation (continuous line) and experimental measurements of the peak
intensity along the selected longitudinal region.


The projection onto C can be obtained by a recursive process using the composite
operator $T = \mathcal{P}_n \mathcal{P}_{n-1} \cdots \mathcal{P}_0$:

$$ f_m = T^m f_0 = T f_{m-1}, \qquad m = 1, 2, \ldots \qquad (4) $$

where $f_0$ is an arbitrary initial function.
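
The recursion can be sketched numerically as below. This is a hedged illustration rather than the authors' implementation. It assumes a paraxial transfer-function form of the Fresnel transform, a hypothetical plane projection `project_plane` that imposes a wanted amplitude inside a region of interest while keeping the computed phase, and a reduced 64 × 64 grid for speed. The input-plane projection is applied last in each sweep so that the returned field is amplitude-only, and the final threshold is an arbitrary choice.

```python
import numpy as np

def fresnel(field, z, wavelength, pitch):
    """Paraxial free-space propagation over a distance z (negative z inverts it)."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pitch)
    f2 = fx[:, None] ** 2 + fx[None, :] ** 2
    transfer = np.exp(-1j * np.pi * wavelength * z * f2)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def project_plane(field, roi, amplitude):
    """Assumed p_z: impose the wanted amplitude inside roi, keep the phase."""
    out = field.copy()
    out[roi] = amplitude[roi] * np.exp(1j * np.angle(field[roi]))
    return out

def project_input(field):
    """Input-plane constraint of Eq. (2): positive real part, zero elsewhere."""
    return np.maximum(field.real, 0.0).astype(complex)

def pocs(f0, planes, roi, amplitude, wavelength, pitch, iterations):
    f = f0.astype(complex)
    for _ in range(iterations):
        for z in planes:
            # Eq. (1): propagate, project on the plane, propagate back.
            g = project_plane(fresnel(f, z, wavelength, pitch), roi, amplitude)
            f = fresnel(g, -z, wavelength, pitch)
        f = project_input(f)                  # amplitude-only element
    return f

n, pitch, wavelength = 64, 100e-6, 630e-9     # cell size and wavelength from the text
planes = np.linspace(0.50, 0.60, 6)           # planes inside the domain D (assumed spacing)
yy, xx = np.indices((n, n))
roi = (np.abs(xx - n // 2) <= 16) & (np.abs(yy - n // 2) <= 16)
amplitude = np.zeros((n, n))
amplitude[n // 2, n // 2 + 12] = 4.0          # peak 12 cells (1.2 mm) off axis, dark elsewhere

mask = pocs(np.ones((n, n)), planes, roi, amplitude, wavelength, pitch, iterations=20)
binary = mask.real > 0.5 * mask.real.max()    # direct binarization, hypothetical threshold
print(binary.mean())
```

The amplitude imposed in `project_plane` is not a convex constraint in the strict sense, so the sketch follows the spirit of the sequential-projection scheme without inheriting the POCS convergence guarantees.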

We  applied  the  method  to  design  low  resolution  elements  having  128  x  128  binary 
cells  which  work  in  on-axis  configuration.  This  is  attractive  because  there  is  a  reduction 
in  design  and  fabrication  efforts  and  real  time  devices  are  available. 

In a first experiment a “nondiffracting beam” was generated. We sought a light
distribution that presents a constant intensity peak along ten centimeters in the
longitudinal direction z. The peak should be situated off the optical axis and remain parallel
to it in a dark region around the axis (see Fig. 2).

The input WF is a plane wave and the binary cells are 100 µm square. The domain
D was chosen between $d_1 = 50$ cm and $d_2 = 60$ cm for λ = 630 nm. An amplitude only
mask was first generated and then direct binarization with an appropriate threshold was
performed. The design obtained after 20 iterations is shown in Fig. 2(a). Observe that
this binary mask has a reduced information content. The element was produced on
chromium coated plates by photolithography.


Figure 4. a) Binary amplitude diffractive element for generating a
non-symmetrical array of “nondiffracting beams”. b) Experimental result of the
intensity distribution at z = 52 cm.

The light distribution at different transversal planes was measured with a
photodiode camera. Fig. 2(b) shows a typical transversal distribution at z = 53 cm. In Fig. 3(a)
we show an experimental cross section of the peak region together with the result of the
simulations. The peak width is under 25 µm, comparable to the diffraction limit.
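
As a rough plausibility check of the quoted peak width, one can compare it with the λz/D scale for an element of width D. The estimate below is an assumption made here for illustration (the paper does not state which diffraction-limit formula it uses); the numbers are the ones quoted in the text.

```python
wavelength = 630e-9            # m, from the text
cells = 128                    # cells per side, from the text
cell_size = 100e-6             # m, from the text
aperture = cells * cell_size   # element width D = 12.8 mm

def spot_size(z):
    """Assumed diffraction-limit scale lambda * z / D at the distance z (in metres)."""
    return wavelength * z / aperture

for z in (0.50, 0.53, 0.60):   # planes inside the domain D
    print(f"z = {z:.2f} m: lambda*z/D = {spot_size(z) * 1e6:.1f} um")
```

At z = 50 cm this gives about 24.6 µm, which is in line with the statement that the measured width is comparable to the diffraction limit.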

The  graph  in  Fig.  3(b)  shows  the  experimental  constancy  of  the  peak  intensity  in 
the  selected  area  compared  to  the  simulations. 

The  second  example  is  an  array  of  five  “nondiffracting  beams”  instead  of  only  one. 
The  binary  diffractive  element  for  a  system  of  five  peaks  propagating  for  5  cm  is  shown  in 
Fig. 4(a). The transversal intensity at z = 52 cm is shown in Fig. 4(b). The distribution
starts at z = 50 cm and the peak width is 25 µm for λ = 630 nm.

In  conclusion,  preliminary  experiments  indicate  that  the  proposed  approach  leads 
to  efficient  solutions  for  various  problems  in  accordance  with  the  design  constraints  and 
the  simulations.  As  examples,  “nondiffracting  beams”  and  nondiffracting  beam  arrays 
were  demonstrated.  The  procedure  adapts  the  design  to  the  available  space  bandwidth 
and  to  the  imposed  constraints.  As  the  method  is  not  restricted  to  incident  plane  waves, 
the  design  of  diffractive  elements  can  be  adapted  to  any  WF  such  as  beams  coming 
directly from laser sources.


Abstract. A complete statistical analysis and modeling of a generic three-plane
optical  processor  is  described.  The  device  models  used  are  detailed  and  the  output 
signal  statistics  are  determined  for  a  number  of  interesting  special  cases.  The 
concepts  of  accuracy  and  precision  are  then  defined  and  the  problem  of  accuracy 
enhancement  is  formulated  in  detection-  and  estimation-theoretic  contexts. 


1.  Introduction 

The  most  important  and  prominent  features  of  optical  processors  are  their  speed  and 
parallelism  potentials.  However,  they  also  suffer  from  some  fundamental  drawbacks,  the  most 
critical  being  low  computational  accuracy,  that  limit  their  application  to  real-world  problems. 
In  view  of  the  intimate  (inverse)  relationship  between  the  highest  achievable  accuracy  and 
the  maximum  attainable  speed  and  degree  of  parallelism,  it  is  understood  that  a  mastery  of 
accuracy  limitations  and  enhancement  potentials  is  crucial  for  the  full  exploitation  of  the 
strengths  and  capabilities  of  optical  processors. 

The  noise  sources  that  limit  the  accuracy  of  an  optical  processor  fall  into  two  groups. 
On  one  hand,  we  have  the  system  noise  sources,  such  as  diffraction,  crosstalk,  and 
background  radiation,  that  arise  from  the  processor  architecture  and  operation.  On  the  other 
hand,  we  have  the  device  noise  sources  that  introduce  inevitable  inaccuracies  associated  with 
the  physical  representation  and  interpretation  of  the  information-bearing  signals  in  the  system. 
Of  particular  significance  here  are  the  noise  in  the  source  field  intensities  due  to  photon, 
excitation,  and  emission  fluctuations,  the  randomness  in  the  spatial  light  modulator  (SLM) 
intensity  transmittance  due  to  transmission  and  polarization  fluctuations,  and  the  noise  in  the 
detectors  due  to  nonunity  quantum  efficiencies,  gain  fluctuations,  shot  noise,  dark  current,  and 
thermal  noise.  Thus,  given  a  statistical  description  of  the  system  operation  as  well  as  the 
statistical  models  for  the  component  devices,  one  can  utilize  tools  from  statistical  optics  to 
establish  the  statistics  of  the  system  output  signal.  Optimal  detection-  and  estimation-theoretic 
techniques  can  then  be  applied  to  these  statistics  in  an  effort  to  improve  the  accuracy  of  the 
processor. 

In  this  paper,  we  first  review  our  past  work  on  statistical  analysis  and  modeling  of 
optical  processors,  establishing  the  overall  system  output  statistics  for  a  number  of  interesting 
device combinations. We then provide a quantitative definition of accuracy for analog optical
processors, first from a detection-theory perspective and then from an estimation-theory
perspective.  With  the  availability  of  the  output  signal  probability  density  function  (PDF),  the 
highest  achievable  accuracy  for  each  special  case  can  then  be  determined. 


2.  Statistical  analysis  and  modeling 


The  starting  point  for  our  efforts  was  a  rigorous  statistical  analysis  of  a  generic  three-plane 
optical  processor  whose  architecture  is  the  backbone  of  a  wide  class  of  systems  including 
optical  correlators,  optical  interconnects,  and  optical  linear  algebra  processors  [1].  We 
established  the  statistics  of  the  output  signal  for  this  processor  without  committing  ourselves 
to  a  specific  set  of  sources,  modulators,  and  detectors.  Specifically,  we  found  the  conditional 
characteristic  function  of  the  detector  output  voltage  to  have  the  form 


$$ M_{V|R,\rho}(\omega) = \exp\left\{ \int p_Q(q) \int [\alpha r(\tau) + \rho]\,\{\exp[j\omega e q f(t-\tau)] - 1\}\, d\tau\, dq - \frac{\sigma_V^2 \omega^2}{2} \right\} \qquad (1) $$

where e is the electronic charge, q is the random gain in the photodiode with the PDF $p_Q(q)$,
$f(t)$ is the photon-to-voltage impulse response of the detection and post-processing electronics,
$r(t)$ is the stochastic rate process due to the incident field, $\rho$ is the random dark excitation
rate, and $\sigma_V^2$ is the variance of the Gaussian zero-mean thermal noise voltage.

We  then  proceeded  to  insert  statistical  device  models  into  our  general  equations, 
obtaining  system  output  statistics  for  various  combinations  of  sources,  modulators,  and 
detectors [2]. Specifically, we considered classical sources such as laser and light-emitting
diodes, and popular detectors such as p-i-n and avalanche diodes. Since statistical models
for  most  SLMs  are  currently  unavailable,  we  considered  an  ideal  static  device  as  well  as  one 
with  hypothetical  random  transmittance.  Finally,  near-  and  far-field  free-space  propagation 
were  assumed  between  processor  planes.  Table  1  summarizes  the  various  statistical  and 
deterministic  device  models  used  [3-7]. 

The  overall  system  modeling  was  performed  in  stages  of  increasing  sophistication  [2]. 
We  started  with  an  idealized  system  where  the  only  realistic  components  were  the  sources, 
thus  reproducing  well-known  results.  We  then  moved  on  to  consider  realistic  detectors  in 
combination  with  realistic  sources.  The  next  step  was  to  bring  in  the  randomness  due  to  the 
use  of  a  realistic  SLM.  Finally,  the  most  general  form  of  the  processor  was  reached  by 
including  background,  crosstalk,  and  diffraction  effects. 

For each case, we first calculated the statistics (means, mutual coherence functions,
and PDFs) of the field incident on the detectors in terms of those of the sources and the SLM
in the processor. These intensity statistics were then used to derive the statistics of the rate
process r(t). Finally, the output signal statistics were obtained by removing the conditioning
on r(t). In the ideal detector case, we obtained the well-known negative-binomial and
multifold Laguerre PDFs for thermal and laser sources, respectively [4], while for realistic detector
cases, we were able to invoke the Gaussian approximation obtained by the Maclaurin
expansion of the complex exponential in Eq. (1) [8]. The results for the realistic SLM case
were then readily obtained from these expressions by simple averaging.
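
The negative-binomial result for a thermal source and an ideal photon counter can be illustrated with a short Monte Carlo sketch. It is not taken from the paper: it assumes the textbook model in which the integrated intensity is gamma distributed with M degrees of freedom and the counts are Poisson given that intensity, and the mean count and M below are arbitrary illustrative values.

```python
import numpy as np

def thermal_photocounts(mean_count, modes, samples, rng):
    """Counts of an ideal photon counter for thermal light (assumed gamma-Poisson model)."""
    # Integrated intensity (in units of mean counts) with `modes` degrees of freedom.
    w = rng.gamma(shape=modes, scale=mean_count / modes, size=samples)
    # Conditional on the intensity, the counts are Poisson.
    return rng.poisson(w)

rng = np.random.default_rng(0)
mean_count, modes = 20.0, 4
counts = thermal_photocounts(mean_count, modes, 200_000, rng)

sample_mean = counts.mean()
sample_var = counts.var()
# Negative-binomial variance: Poisson part plus excess (signal-dependent) noise.
predicted_var = mean_count + mean_count**2 / modes
print(sample_mean, sample_var, predicted_var)
```

The variance exceeds the mean by the term that grows with the square of the mean, which is one simple instance of the signal-dependent noise discussed in the next section.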


3.  Accuracy  limitations  and  enhancement 

Computational  accuracy  of  a  processor  can  best  be  specified  in  terms  of  the  signal  resolution 
it affords at its output while simultaneously satisfying an average probability-of-error criterion.

Table 1. Description of statistical and deterministic device models.

COMPONENT               DEVICE                  MODEL

Sources                 Laser diodes            Phase-diffused coherent and narrow-band chaotic
                                                superposition as the output of a nonlinear van der
                                                Pol oscillator in the quasi-linear regime

                        Light-emitting diodes   Circular complex Gaussian random process due to
                                                independent, spatially uniform, temporally Poisson,
                                                homogeneously-broadened emissions

Spatial light           Static screen           Constant transmittance due to lack of
modulator                                       spatio-temporal material fluctuations

                        Random screen           Circular complex Gaussian random process due to
                                                transmittance and/or polarization fluctuations in
                                                the medium

Detectors               Ideal photon counter    Unity gain, rectangular impulse response, no dark
                                                or thermal noise

                        p-i-n diodes            Constant dark excitation rate, unity gain,
                                                exponential (op-amp) impulse response, Gaussian
                                                thermal noise

                        Avalanche diodes        Constant dark excitation rate, Yule-Furry random
                                                gain, exponential (op-amp) impulse response,
                                                Gaussian thermal noise

Propagation             Near-field free space   Dirac-delta point-spread functions due to
structures                                      geometrical-optics regime propagation

                        Far-field free space    Quadratic-phase point-spread functions due to
                                                Fresnel-regime propagation


Quantitatively, this signal resolution can be defined in terms of the maximum number of
identifiable signal levels L within the signal dynamic range or, equivalently, the
maximum number of bits n, where $n = \log_2 L$. It is trivial, yet crucial, to recognize that, in a
fixed  dynamic  range,  accuracy  (i.e.,  number  of  levels  or  bits)  and  precision  (i.e.,  probability 
of  error  per  level  or  bit  error  rate),  both  of  which  should  be  specified  for  a  meaningful 
expression  of  processor  performance,  exhibit  an  inverse  relationship. 

The  determination  of  the  maximum  number  of  resolvable  signal  levels  involves  the 
optimal  quantization  of  the  signal  dynamic  range.  This  problem  can  be  solved  either 
iteratively  by  the  Lloyd-Max  algorithm  or  recursively  by  adaptive  dynamic  programming  [9]. 
In  the  case  of  signal-independent  noise,  the  solution  is  trivial  since  uniform  quantization 
proves to be the optimal scheme, resulting in uniformly distributed signal levels with
equal-size decision regions centered at these levels and decision thresholds placed halfway between
consecutive  levels.  In  our  case,  the  problem  is  complicated  by  the  fact  that  the  noise  at  the 
processor  output  is  signal-dependent,  calling  for  a  more  sophisticated  quantization  scheme. 
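
For orientation, the Lloyd-Max iteration mentioned above can be sketched in a sample-based form. Note the assumption: this version minimises the mean squared quantization error of a set of signal samples, which is the classical Lloyd-Max criterion and not the probability-of-error criterion formulated below; the function name and the test data are illustrative.

```python
import numpy as np

def lloyd_max(samples, levels, iterations=100):
    """Alternate between midpoint thresholds and centroid levels (sample-based)."""
    x = np.sort(np.asarray(samples, dtype=float))
    # Start from levels spread uniformly over the observed dynamic range.
    v = np.linspace(x[0], x[-1], levels + 2)[1:-1]
    for _ in range(iterations):
        z = 0.5 * (v[:-1] + v[1:])            # thresholds halfway between levels
        cell = np.searchsorted(z, x)          # decision region index of each sample
        new_v = np.array([x[cell == i].mean() if np.any(cell == i) else v[i]
                          for i in range(levels)])
        if np.allclose(new_v, v):
            break
        v = new_v
    return v, 0.5 * (v[:-1] + v[1:])

rng = np.random.default_rng(0)
signal = rng.uniform(0.0, 1.0, size=100_000)  # uniformly distributed test signal
v, z = lloyd_max(signal, 4)
print(v, z)
```

For a uniformly distributed signal the iteration settles on equally spaced levels with thresholds halfway between them, i.e. the uniform scheme described above; a non-uniform signal density would move the levels accordingly.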

Formally, for a given maximum tolerable average probability of error per signal level,
denoted $P_e$, and for equal a priori probabilities $P(v_i) = 1/L$, $i = 1, 2, \ldots, L$, the maximum
attainable accuracy can be found by solving for the maximum value of L in the equation

$$ \frac{1}{L} \sum_{i=1}^{L} \left[ 1 - \int_{z_{i-1}}^{z_i} p(v \mid v_i)\, dv \right] = P_e, \qquad (2) $$

where $p(v \mid v_i)$ is the output signal PDF for the level $v_i$, and the choice of signal levels
$v_i$, $i = 1, 2, \ldots, L$, and decision thresholds $z_i$, $i = 0, 1, \ldots, L$, subject to the
constraints $z_0 < v_1 < z_1 < v_2 < \cdots < z_{L-1} < v_L < z_L$, comprise the
optimal quantization scheme [10].
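
A numerical reading of this criterion is sketched below. The assumptions are stated plainly: Gaussian output noise whose standard deviation may depend on the level, a uniform (not optimal) placement of levels and thresholds, and open-ended outer decision regions. The helper names are hypothetical and the search simply stops at the first L that violates the error budget.

```python
import math

def gaussian_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def average_error(levels, thresholds, noise_std):
    """Average probability that the output leaves its own decision region,
    for equiprobable levels and Gaussian noise of std noise_std(level)."""
    total = 0.0
    for i, v in enumerate(levels):
        s = noise_std(v)
        inside = gaussian_cdf(thresholds[i + 1], v, s) - gaussian_cdf(thresholds[i], v, s)
        total += 1.0 - inside
    return total / len(levels)

def uniform_scheme(v_min, v_max, count):
    """Levels at the centres of `count` equal regions; outer regions left open-ended."""
    step = (v_max - v_min) / count
    thresholds = [v_min + k * step for k in range(count + 1)]
    thresholds[0], thresholds[-1] = -math.inf, math.inf
    levels = [v_min + (k + 0.5) * step for k in range(count)]
    return levels, thresholds

def max_levels(v_min, v_max, noise_std, p_max, limit=1024):
    """Largest L (up to `limit`) whose average error stays within p_max."""
    best = 1
    for count in range(2, limit + 1):
        levels, thresholds = uniform_scheme(v_min, v_max, count)
        if average_error(levels, thresholds, noise_std) <= p_max:
            best = count
        else:
            break
    return best

# Signal-independent noise: std 0.01 over a unit dynamic range, error budget 1e-3.
levels_const = max_levels(0.0, 1.0, lambda v: 0.01, 1e-3)
# Signal-dependent noise (illustrative model): std grows with the level.
levels_dep = max_levels(0.0, 1.0, lambda v: 0.01 + 0.02 * v, 1e-3)
print(levels_const, math.log2(levels_const), levels_dep)
```

With signal-dependent noise the uniform scheme is limited by the noisiest levels, which is why a non-uniform placement of levels and thresholds becomes worthwhile.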

Alternatively, we can formulate the problem in the context of parameter estimation
theory by ascribing an a priori parameter PDF $p(m_s)$ to the signal we wish to estimate. Upon
the observation of k samples of the output signal, creating the sample vector $\mathbf{v}$ with a joint
PDF $p(\mathbf{v} \mid m_s)$, we can form the a posteriori parameter PDF

$$ p(m_s \mid \mathbf{v}) = \frac{p(\mathbf{v} \mid m_s)\, p(m_s)}{\int p(\mathbf{v} \mid m_s)\, p(m_s)\, dm_s} \qquad (3) $$

which we ideally expect and desire to satisfy $p(m_s \mid \mathbf{v}) \to \delta(m_s - \tilde{m}_s)$ as $k \to \infty$. Here $\tilde{m}_s$ is
the true value of the signal [10]. In this approach, accuracy can be quantified by the
Cramér-Rao lower bound on the variance of the estimate [11], which offers a tradeoff
between accuracy and speed.

In  both  approaches,  the  difficulties  due  to  the  signal  dependence  of  noise  at  the  output 
can  be  alleviated  by  the  use  of  suitable  normalizing  transforms  which  remove  from  the  noise 
power  the  dependence  on  the  signal  mean  [10,  12]. 
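
One familiar example of such a normalizing transform, used here purely as an illustration since the paper does not name a specific one, is the Anscombe square-root transform for Poisson-distributed counts. The sketch assumes pure Poisson noise and arbitrary mean values.

```python
import numpy as np

rng = np.random.default_rng(0)
means = [5.0, 20.0, 80.0, 320.0]    # illustrative signal means

raw_var, stabilised_var = [], []
for m in means:
    counts = rng.poisson(m, size=200_000)
    raw_var.append(counts.var())                        # grows with the signal mean
    transformed = 2.0 * np.sqrt(counts + 0.375)         # Anscombe transform
    stabilised_var.append(transformed.var())            # roughly constant

for m, rv, sv in zip(means, raw_var, stabilised_var):
    print(f"mean {m:6.1f}: raw variance {rv:8.2f}, transformed variance {sv:.3f}")
```

After the transform the noise variance is close to one for all of the means, so level spacing and thresholds can be chosen as in the signal-independent case and mapped back through the inverse transform.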


Abstract

Signals  with  significant  overlap  in  both  the  space  and  frequency  domains  may  have  little 
or  no  overlap  in  a  fractional  Fourier  domain.  Spatial  filtering  in  these  domains  may  allow 
us  to  eliminate  distortion  components  which  cannot  be  eliminated  in  the  ordinary  Fourier 
domain. 


1  Introduction 

Space-invariant  filtering  may  be  performed  by  multiplying  the  Fourier  transform  of  the  input 
signal  by  the  Fourier  transform  of  the  impulse  response.  Recently  we  have  discussed  how  various 
space-variant  operations  can  be  performed  by  multiplying  with  a  filter  function  in  a  fractional 
Fourier  domain  [1].  These  operations  can  be  realized  optically,  because  the  fractional  Fourier 
transform  can  be  realized  optically.  One  approach  is  based  on  the  use  of  quadratic  graded 
index  media  [2,  3,  4],  whereas  another  is  based  on  the  use  of  bulk  lenses  [5].  The  graded  index 
approach  is  closely  connected  to  the  definition  of  the  fractional  Fourier  transform  in  terms  of  its 
spectral  decomposition,  whereas  the  bulk  implementation  is  closely  connected  to  its  definition 
in  terms  of  its  linear  transform  kernel  [1]. 

The  many  mathematical  properties  of  the  fractional  Fourier  transform,  its  relation  to  the 
Wigner space-frequency distribution, wavelet transforms, and chirp basis expansions, its
applications to signal processing, and issues relating to its optical implementation are discussed in
the references. Due to limited space, we will here content ourselves with the presentation of
two examples of how space-variant filtering can be achieved by applying simple binary masks in
fractional  Fourier  domains.  Among  the  many  things  we  cannot  mention,  of  particular  interest 
is  correlation  in  fractional  Fourier  domains  and  its  application  to  pattern  recognition. 

2  Definition  of  the  fractional  Fourier  transform 

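The ath order fractional Fourier transform of a function $f(\cdot)$ is denoted by $\mathcal{F}^a[f]$ and may
be defined as [1, 6]:

$$ (\mathcal{F}^a[f])(x) = \int_{-\infty}^{\infty} \frac{e^{-i(\pi\hat\phi/4 - \phi/2)}}{|\sin\phi|^{1/2}} \exp[i\pi(x^2\cot\phi - 2xx'\csc\phi + x'^2\cot\phi)]\, f(x')\, dx', $$

where $\phi = a\pi/2$ and $\hat\phi = \mathrm{sgn}(\sin\phi)$. Some of its properties are: i.) linearity; ii.) $\mathcal{F}^0$ and
$\mathcal{F}^4$ correspond to the identity operation; iii.) $\mathcal{F}^1$ corresponds to the conventional Fourier
transform; iv.) $\mathcal{F}^{a}\mathcal{F}^{b} = \mathcal{F}^{a+b}$.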
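
The definition can be tried out numerically with a direct quadrature of the kernel. The sketch below is a plain Riemann sum on a uniform grid, restricted to orders with 0 < |a| < 2; it is not a fast algorithm, and the function names, grid and test function are choices made here. It checks that the Gaussian exp(-πx²) is left unchanged and that two half-order transforms reproduce the ordinary Fourier transform.

```python
import numpy as np

def frft_kernel(x, xp, a):
    """Kernel of the a-th order transform between grids xp (input) and x (output)."""
    if not 0.0 < abs(a) < 2.0:
        raise ValueError("this sketch only handles orders with 0 < |a| < 2")
    phi = a * np.pi / 2.0
    s = np.sin(phi)
    cot, csc = np.cos(phi) / s, 1.0 / s
    amp = np.exp(-1j * (np.pi * np.sign(s) / 4.0 - phi / 2.0)) / np.sqrt(abs(s))
    phase = np.pi * (x[:, None] ** 2 * cot
                     - 2.0 * x[:, None] * xp[None, :] * csc
                     + xp[None, :] ** 2 * cot)
    return amp * np.exp(1j * phase)

def frft(f, x, a):
    """a-th order transform of the samples f on the uniform grid x (Riemann sum)."""
    dx = x[1] - x[0]
    return frft_kernel(x, x, a) @ f * dx

x = np.linspace(-6.0, 6.0, 601)
gauss = np.exp(-np.pi * x**2)

g_half = frft(gauss, x, 0.5)        # Gaussian is unchanged by any order
g_one = frft(gauss, x, 1.0)         # order 1: ordinary Fourier transform

test = gauss * (1.0 + x)            # a function that is not invariant
twice = frft(frft(test, x, 0.5), x, 0.5)
once = frft(test, x, 1.0)
print(np.abs(g_half - gauss).max(), np.abs(twice - once).max())
```

The grid must be wide and fine enough for the function and its transforms to be negligible at the edges and for the chirp in the kernel to be sampled adequately; both hold for the Gaussian-type test functions used here.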

One of the most important properties states that performing the ath fractional Fourier
transform operation corresponds to rotating the Wigner distribution by an angle $\phi = a(\pi/2)$
in the clockwise direction. We are unable to discuss the Wigner distribution here, although
it is necessary for a complete understanding of the filtering examples discussed below. The
reader is encouraged to consult [1] and the references given there. Roughly speaking, the
Wigner distribution of a function $f(\cdot)$, denoted by $W_f(x, \nu)$, can be interpreted as a function
that indicates the distribution of the signal energy over space $x$ and frequency $\nu$. Defining
the rotation operator $R_\phi$ for two-dimensional functions, corresponding to a counterclockwise
rotation by $\phi$, the property mentioned above can be expressed as $W_{\mathcal{F}^a[f]}(x, \nu) = R_{-\phi} W_f(x, \nu)$.
Another version of this property [7] is $\mathcal{R}_\phi[W_f(x, \nu)] = |\mathcal{F}^a[f]|^2$, where the operator $\mathcal{R}_\phi$ is the
Radon transform evaluated at the angle $\phi$. The Radon transform of a two-dimensional function
is its projection on an axis making angle $\phi$ with the $x$ axis.

Applications  other  than  that  discussed  in  this  paper  may  be  found  in  the  references. 


3  The  fractional  Fourier  transform  in  optics 

3.1  Optical  implementation  of  the  fractional  Fourier  transform 

Analog  optical  implementations  of  the  fractional  Fourier  transform  have  already  been  presented. 
In  [1,  2,  3,  4]  we  discussed  the  fractional  Fourier  transforming  property  of  quadratic  graded 
index  media.  Lohmann  suggested  two  systems  consisting  of  thin  lenses  separated  by  free-space 
[5].  That  the  two  approaches  were  equivalent  and  represented  the  fractional  Fourier  transform 
as  defined  above  was  demonstrated  in  [8]. 

The  fact  that  the  fractional  Fourier  transform  can  be  realized  optically  means  that  the 
filtering  examples  discussed  below  can  also  be  realized  optically.  Experimental  results  may  be 
found  in  [9]. 

3.2  The  fractional  Fourier  transform  as  a  tool  for  analyzing  optical  systems 

In [10, 11] we show that there exists a fractional Fourier transform relation between the
(appropriately scaled) optical amplitude distributions on two spherical reference surfaces with given
radii  and  separation.  It  is  possible  to  determine  the  order  and  scale  parameters  associated 
with  this  fractional  transform  given  the  radii  and  separation  of  the  surfaces.  Alternatively, 
given  the  desired  order  and  scale  parameters,  it  is  possible  to  determine  the  necessary  radii 
and  separation. 

This  result  provides  an  alternative  statement  of  the  law  of  propagation  and  allows  us  to 
pose  the  fractional  Fourier  transform  as  a  tool  for  analyzing  and  describing  a  rather  general 
class  of  optical  systems. 

One  of  the  central  results  of  diffraction  theory  is  that  the  far-field  diffraction  pattern  is  the 
Fourier  transform  of  the  diffracting  object.  It  is  possible  to  generalize  this  result  by  showing 
that  the  field  patterns  at  closer  distances  are  the  fractional  Fourier  transforms  of  the  diffracting 
object  [10]. 

More generally, in an optical system involving many lenses separated by arbitrary distances,
it is possible to show that the amplitude distribution is continuously fractional Fourier
transformed as it propagates through the system. The order $a(z)$ of the fractional transform
observed at the distance $z$ along the optical axis is a continuous, monotonically increasing
function. As light propagates, its distribution evolves through fractional transforms of
increasing orders. Wherever the order of the transform $a(z)$ is equal to $4j + 1$ for any
integer $j$, we observe the Fourier transform of the input. Wherever the order is equal to
$4j + 2$, we observe an inverted image, etc. [10, 11].
Figure 1: Example 1 (left) and example 2 (right)


4  Filtering  examples 

Consider the signal $\exp[-\pi(x - 4)^2]$ distorted additively by
$\exp(-i\pi x^2)\,\mathrm{rect}(x/16)$. The magnitude of their sum is displayed in part a., on
the left hand side of the figure. These signals overlap in the frequency domain as well. In
part b., we show their $a = 0.5$th fractional Fourier transform. We observe that the signals are
separated in this domain. The chirp distortion is transformed into a peaked function which does
not exhibit significant overlap with the signal transform, so that it can be blocked out by a
simple mask (part c.). Inverse transforming to the original domain, we obtain the desired signal
nearly perfectly cleansed of the chirp distortion (part d.).
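
The separation seen in part b. can be checked numerically. The sketch below is illustrative only: it assumes the usual integral kernel of the fractional Fourier transform (the definition itself lies outside this section), evaluates it by a plain Riemann sum rather than by a fast algorithm, and the names `frft_quadrature`, `signal` and `chirp` are ours.

```python
import numpy as np

def frft_quadrature(f, xp, a, x):
    """Order-a fractional Fourier transform of samples f(xp), evaluated at points x.

    Assumed kernel (valid when sin(phi) != 0, with phi = a*pi/2):
        A * exp[i*pi*(x^2 cot(phi) - 2 x x' csc(phi) + x'^2 cot(phi))]
    Direct Riemann-sum quadrature: slow, but easy to inspect.
    """
    phi = a * np.pi / 2
    s = np.sin(phi)
    cot, csc = np.cos(phi) / s, 1.0 / s
    amp = np.exp(-1j * (np.pi * np.sign(s) / 4 - phi / 2)) / np.sqrt(abs(s))
    phase = (cot * x[:, None] ** 2
             - 2 * csc * x[:, None] * xp[None, :]
             + cot * xp[None, :] ** 2)
    return amp * (np.exp(1j * np.pi * phase) @ f) * (xp[1] - xp[0])

xp = np.arange(-12.0, 12.0, 0.01)        # input grid (space domain)
x = np.linspace(-6.0, 6.0, 601)          # output grid in the a = 0.5 domain
signal = np.exp(-np.pi * (xp - 4) ** 2)
chirp = np.exp(-1j * np.pi * xp ** 2) * (np.abs(xp) < 8)   # rect(x/16) window

S = np.abs(frft_quadrature(signal, xp, 0.5, x))
C = np.abs(frft_quadrature(chirp, xp, 0.5, x))
signal_peak = x[np.argmax(S)]
chirp_peak = x[np.argmax(C)]
print(f"signal peak near x = {signal_peak:.2f}, chirp peak near x = {chirp_peak:.2f}")
```

Under these assumptions the quadratic phase of the chirp cancels against the kernel at $a = 0.5$, so the chirp collapses to a narrow peak at the origin, while the Gaussian remains a Gaussian centred near $4\cos(\pi/4)$.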

Now we consider a slightly more involved example in which the distorting signal is also real.
The signal $\exp(-\pi x^2)$ is distorted additively by
$\cos[2\pi(x^2/2 - 4x)]\,\mathrm{rect}(x/8)$, as shown in part a., on the right hand side of the
figure. The $a = 0.5$th transform is shown in part b. One of the complex exponential chirp
components of the cosine chirp has been separated in this domain and can be masked away, but the
other still distorts the transform of the Gaussian. After masking out the separated chirp
component (not shown), we take the $a = -1$st transform (which is just an inverse Fourier
transform) to arrive at the $a = -0.5$th domain (part c.). Here the other chirp component is
separated and can be blocked out by another simple mask. Finally, we take the 0.5th transform to
come back to our home domain (part d.), where we have recovered our Gaussian signal, with a
small error.

The  examples  above  have  been  limited  to  chirp  distortions  which  are  particularly  easy  to 
separate  in  a  fractional  Fourier  domain  (just  as  pure  harmonic  distortion  is  particularly  easy  to 
separate  in  the  ordinary  Fourier  domain).  However,  it  is  possible  to  filter  out  more  general  types 
of  distortion  as  well.  In  some  cases  this  may  require  several  consecutive  filtering  operations  in 
several  fractional  domains  of  different  order  [1].  There  is  nothing  special  about  our  choice  of 
Gaussian  signals  other  than  the  fact  that  they  allow  easy  analytical  manipulation.  Also,  there 
is  nothing  special  about  the  0.5th  domain.  It  just  turns  out  that  this  is  the  domain  of  choice 
for  the  examples  considered  above. 

In  the  above  examples  we  have  demonstrated  that  the  method  works,  but  did  not  discuss 
what  led  us  to  transform  to  a  particular  domain  and  what  gave  us  the  confidence  that  doing  so 
will get rid of the distortion. This becomes very transparent once one understands the
relationship between the fractional Fourier transform and the Wigner distribution. This
relationship, as well as the general philosophy behind such filtering operations, is discussed
in [1].


5  Conclusion 

What  we  know  as  the  space  and  spatial  frequency  domains  are  merely  special  cases  of  fractional 
domains.  These  domains  are  indexed  by  the  parameter  a.  The  representation  of  a  signal  in  the 
ath  domain  is  the  ath  fractional  Fourier  transform  of  its  representation  in  the  a  =  0th  domain, 
which  we  define  to  be  the  space  domain.  The  representation  in  the  a  =  1st  domain  is  the 
conventional  Fourier  transform.  If  we  set  up  a  two-dimensional  space,  called  the  Wigner  space, 
such that one axis ($x$) corresponds to the a = 0th domain (the conventional space domain) and
the other ($\nu$) to the a = 1st domain (the conventional spatial frequency domain), then the ath
domain corresponds to an axis making an angle $\phi = a\pi/2$ with the $x$ axis.

A  desired  signal  and  noise  may  overlap  in  both  conventional  space  and  frequency  domains, 
but  not  in  a  particular  fractional  domain.  Even  when  this  is  not  the  case,  spatial  filtering  in  a 
few  fractional  domains  in  cascade  may  enable  the  elimination  of  noise  quite  conveniently.  It  is 
possible  to  implement  these  operations  optically,  as  well  as  with  a  fast  digital  algorithm. 

It  is  a  pleasure  to  acknowledge  the  contributions  of  A.  W.  Lohmann  of  the  University  of 
Erlangen-Nürnberg in the form of many discussions and suggestions.
References

[1] H.M. Ozaktas, B. Barshan, D. Mendlovic and L. Onural, J. Opt. Soc. Am. A, 11:547-559, 1994.

[2] H.M. Ozaktas and D. Mendlovic, Opt. Commun., 101:163-169, 1993.

[3] D. Mendlovic and H.M. Ozaktas, J. Opt. Soc. Am. A, 10:1875-1881, 1993.

[4] H.M. Ozaktas and D. Mendlovic, J. Opt. Soc. Am. A, 10:2522-2531, 1993.

[5] A.W. Lohmann, J. Opt. Soc. Am. A, 10:2181-2186, 1993.

[6] A.C. McBride and F.H. Kerr, IMA J. Appl. Math., 39:159-175, 1987.

[7] A.W. Lohmann and B.H. Soffer, Technical Digest of the 1993 Annual Meeting of the OSA, p. 109,
1993.

[8] D. Mendlovic, H.M. Ozaktas and A.W. Lohmann, "Graded index fibres, Wigner distribution
functions and the fractional Fourier transforms", to be published in Appl. Opt.

[9] R.G. Dorsch, A.W. Lohmann, Y. Bitran, D. Mendlovic and H.M. Ozaktas, "Chirp filtering in the
fractional Fourier domain: experimental results", to be published in Appl. Opt.

[10] H.M. Ozaktas and D. Mendlovic, "The fractional Fourier transform as a tool for analysing beam
propagation and spherical mirror resonators", to be published in Opt. Lett.

[11] H.M. Ozaktas and D. Mendlovic, "Fractional Fourier Optics", submitted to J. Opt. Soc. Am. A.


Inst.  Phys.  Conf.  Ser.  No  139:  Part  III 

Paper  presented  at  Opt.  Comput.  Int.  Conf.,  Edinburgh,  22-25  August  1994 
©  1995  British  Crown  Copyright 




A  novel  form  of  incoherent  optical  correlator 


M  F  Lewis 

DRA Malvern, St. Andrews Road, Malvern, Worcs. WR14 3PS, UK

Abstract.  Much  work  on  digital  signal  processing,  DSP,  and  optical  signal  processing,  OSP, 
is  aimed  at  pattern  recognition.  Here  we  describe  a  simple  and  rugged  but  promising  hybrid 
digital/incoherent  optical  approach  to  this  problem. 


1.  Introduction 

The  problem  of  identifying  the  presence  of  a  given  pattern  in  a  scene  is  superficially  similar 
to  that  of  identifying  a  given  radio  or  radar  signal  in  a  noisy  environment.  The  optimum 
solution  to  the  latter  problem  was  devised  in  WW2  and  takes  the  form  of  either  a  correlator 
or  a  matched  filter,  appropriately  modified  if  the  background  noise  does  not  have  a  white 
spectrum.  For  this  reason  the  first  and  many  subsequent  approaches  to  pattern  recognition 
have been based on these components. However, in reality there are numerous differences
between these superficially similar problems. For example, the electronic problem is
one-dimensional (time) while in pattern recognition we need to process a two-dimensional
projection of a three-dimensional object, often with no a priori knowledge of its scale or
orientation. In addition it turns out that the background spectrum is never white. Much of
the  research  effort  over  the  past  thirty  years  has  sought  to  overcome  such  complications  of 
the  pattern  recognition  problem.  The  work  described  here  is  a  revisit  of  an  earlier 
incoherent  optical  correlator,  but  with  the  incorporation  of  new  features  to  improve  its 
performance  in  various  respects.  In  particular  we  describe  a  hybrid  OSP/DSP  processor 
incorporating spectral whitening which exploits each technology to advantage, and a
multi-channel version of this device for use with scrolling input data/scenes.


2.  Summary  of  new  work  on  incoherent  correlators 

A  simple  and  elegant  incoherent  correlator  is  described  in  various  standard  textbooks. 

It  operates  by  a  shadow  casting  process  based  on  geometrical  optics,  as  shown  schematically 
in  Figure  1.  As  it  stands  this  setup  is  non-ideal  because  all  objects  correlate  with  a  given 
reference  to  some  extent,  the  output  deriving  from  optical  intensity  which  is  positive 
definite.  This  difficulty  arises  because  the  (spatial)  spectrum  of  the  scenery  is  not  "white". 
Figure 1. Schematic incoherent optical correlator


Figure  2.  Modified  incoherent  optical  correlator 
incorporating  edge-enhancement 


In  particular,  because  light  intensity  is  positive  definite  the  spectrum  is  always  peaked  at 
zero  spatial  frequency.  According  to  the  earlier  analysis  of  temporal  signals,  the  ideal 
processor  should  "prewhiten"  this  spectrum  prior  to  matched  filtering.  Computer  simulations 
show  that  the  consequence  of  prewhitening  is  to  require  scenes  and  references  which  are 
essentially  real  edge-enhanced  versions  of  the  originals.  This  is  agreeable,  as  it  means  that 
the original apparatus can be employed; Figure 2 shows a version similar to the one we
have  used  in  which  the  input  scene  is  edge-enhanced  in  the  time  domain  at  video  rates  using 
DSP  chips.  It  should  be  emphasised  that  true  edge  enhancement  requires  bipolar  operation, 
requiring  duplication  of  part  of  the  processor.  To  date  we  have  not  implemented  this  for 
reasons  of  simplicity,  and  because  the  use  of  a  single-channel  allows  identification  of  dark 
signals  on  a  light  background  as  well  as  light  signals  on  a  dark  background. 
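
The effect of edge-enhancement on the correlation output can be illustrated with a small numerical sketch. It is not the apparatus described here: the correlation is computed digitally with FFTs, the edge-enhancement is a rectified 4-neighbour Laplacian chosen purely for illustration (single-channel, hence non-negative), and the target, scene and function names are hypothetical.

```python
import numpy as np

def correlate(scene, ref):
    """Circular cross-correlation of two real images via the FFT."""
    return np.real(np.fft.ifft2(np.fft.fft2(scene) * np.conj(np.fft.fft2(ref))))

def edge_enhance(img):
    """Magnitude of a 4-neighbour discrete Laplacian (an assumed stand-in filter)."""
    lap = (np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1) - 4 * img)
    return np.abs(lap)

def sharpness(corr):
    """Correlation one pixel beside the peak divided by the peak (smaller = sharper)."""
    r, c = np.unravel_index(np.argmax(corr), corr.shape)
    return corr[r, (c + 1) % corr.shape[1]] / corr[r, c]

ref = np.zeros((64, 64))
ref[:9, :9] = 1.0                       # hypothetical 9x9 bright target
scene = np.full((64, 64), 0.2)          # uniform background intensity
scene[20:29, 30:39] += 0.8              # same target placed at row 20, column 30

raw = correlate(scene, ref)
enhanced = correlate(edge_enhance(scene), edge_enhance(ref))
raw_peak = np.unravel_index(np.argmax(raw), raw.shape)
enh_peak = np.unravel_index(np.argmax(enhanced), enhanced.shape)
print(raw_peak, enh_peak, sharpness(raw), sharpness(enhanced))
```

Both correlations peak at the target position, but the edge-enhanced one falls off faster around the peak, and the uniform background no longer contributes a pedestal to the output.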

Various  other  points  emerge  from  the  computer  simulation,  eg  we  have  identified 
the  importance  of  corners  as  well  as  edges  (for  objects  which  possess  corners)  and  have 
found  that  if  the  target  possesses  some  unique  feature  then  one  should  emphasise  this  feature 
in  the  correlation  process. 

The  scheme  shown  in  Figure  2  has  been  built  and  demonstrated,  but  with  the  scene 
and  reference  transposed,  the  former  employing  a  Panasonic  LC  TV  input  and  the  latter 
being  a  fixed  edge-enhanced  transparency.  Using  artificial  scenery  this  has  enabled  us  to 
identify  a  given  target  in  cluttered  and  noisy  backgrounds.  An  important  issue  in  the  design 
is  the  width  of  the  edges  to  be  used  in  practice,  since  mathematically  these  are  vanishingly 
small.  We  have  chosen  to  implement  the  edges  of  the  central  component  with  quite  wide 
slits  (of  order  0.5mm  in  our  setup)  as  this  conveys  various  benefits  on  the  system: 

(a)  It  allows  most  light  through  to  the  output  stage. 

(b) It gives the correlation process some tolerance to scale and orientation, an important
issue in pattern recognition. When wider edges are used, the effect is similar to that
obtained using a synthetic discriminant function (SDF) filter.

(c)  It  minimises  the  deleterious  effect  of  optical  diffraction  through  narrow  slits,  which 
degrades the correlation peak.

An  agreeable  feature  of  edge-enhancement  is  that  it  causes  a  sharpening  of  the 
correlation  peak,  enabling  the  centre  of  the  object  to  be  located  precisely.  This  is  evident 
from  a  comparison  of  the  output  planes  of  Figures  1  and  2.  A  similar  phenomenon  occurs 
in  certain  coherent  correlators  which  deliberately  or  accidentally  incorporate  a  degree  of 
spectral  prewhitening,  such  as  those  employing  "phase-only  matched  filters".  However,  such 
systems  are  very  dependent  on  a  close  match  between  reference  and  scene.  A  useful 
compromise  is  to  use  wider  slits/sources  to  implement  the  edges,  as  discussed  above.  A 
further  advantageous  feature  of  digital  edge-enhancement  is  that  it  can  eliminate  the  effects 
of  non-uniform  illumination  in  the  scene,  eg  a  target  in  the  shade  of  a  tree  can  be  made  to 
appear  just  as  bright  as  a  fully  illuminated  target  if  it  exhibits  the  same  pattern  of  edges. 

Incoherent  correlators  are  greatly  superior  to  their  coherent  counterparts  in  respect 
of  size,  cost  and  ruggedness  and  can  be  realised  in  a  variety  of  optical  configurations.  A 
potent  implementation  might  employ  an  LC  SLM  to  input  the  edge-enhanced  scene  at  video 
rates, and an array of surface emitting lasers, which can be modulated at GHz rates, to
input  multiple  references  in  very  rapid  succession,  eg  >  100  per  frametime,  to  cover  further 
variations  in  scale  and  orientation.  The  incoherent  correlator  possesses  a  number  of 
additional  advantages  over  coherent  optical  correlators.  Two  of  the  more  important  of  these 
are: 

(1)  Unlike  the  coherent  correlator,  the  incoherent  correlator  can  readily  employ  references 
designed  to  mimic  the  colour  content  of  a  known  target.  (Colour  could  also  be  used  to 
implement  the  bipolar  processor  mentioned  earlier  if  so  required). 

(2)  Optical  correlators  exhibit  invariance  to  object  location  in  two  dimensions.  However 
there  are  many  circumstances  in  which  it  is  only  necessary  to  have  invariance  in  one 
dimension,  eg  when  dealing  with  scrolling  text,  when  items  translate  along  a  production 
line, or when visible, IR, or radar images are gathered from a moving aircraft. In such
circumstances  the  incoherent  correlator  can  exploit  the  second  dimension  to  good  effect.  In 
the  laboratory  we  have  demonstrated  this  by  employing  the  second  dimension  to  seek  three 
different objects, namely the letters N, O and A. The apparatus used to achieve this is
shown  schematically  in  Figure  3.  The  principal  differences  from  Figures  1  and  2  are  (a) 
illumination  of  the  scene  with  a  narrow  "line"  source  and  cylindrical  lens,  the  output  of 
which  diverges  horizontally  but  is  constrained  vertically,  and  (b)  optics  which  separates  the 
outputs  of  the  three  channels  in  the  output  plane;  a  segmented  lens  was  used  for  simplicity. 
The operation of this structure is illustrated in Figure 4. The central photograph shows the
three separate correlation peaks arising if the letters N, O and A are input to the correct
channels. The other four photographs show the lower-level cross-correlations that occur if
the input pattern is stepped up/down by 1 or 2 steps. The camera exposure was the same for
all five photographs.
Figure 3. Schematic side-view of multi-channel incoherent optical correlator
for  use  with  scrolling  input  scenes 


3.  Conclusion 

This  paper  describes  various  modifications  to  a  simple  and  rugged  incoherent  correlator. 
The  incorporation  of  edge-enhancement  leads  to  greatly  improved  inter-class  discrimination, 
while  a  further  modification,  the  multi-channel  correlator,  is  particularly  appropriate  to  the 
processing  of  scrolling  data.  Acceptable  geometries  for  this  type  of  correlator  are  discussed 
in  another  paper  at  this  conference. 
Abstract. We experimentally demonstrate an optical joint transform correlator (JTC) in which
the input image is replaced by a pseudo-inverse synthetic discriminant function (SDF) filter.
This filter provides practical invariance over a 10° range. Since our input amplitude SLM
cannot display other kinds of SDFs, we propose to post-process them with a DBS-based method
and provide simulations.


1.  Introduction 

Optical correlation, and especially joint transform correlation [1], has been known as an
efficient alternative to its electronic counterpart since the 1960s, and compact
implementations of joint transform correlators (JTCs) have recently been presented. But
correlation is inherently rotation- and scale-variant, and a traditional correlation operation
cannot perform pattern recognition in most practical cases. In order to overcome this
limitation, Javidi [2] has proposed and simulated a JTC using synthetic discriminant function
(SDF) filters. Juvells et al. [3] have even provided optical experimental results, but have not
tested the ability of the trained correlator to recognize intermediate rotated or scaled views.
In this paper, we present such experimental results. However, our input system is unable to
display all types of SDF filters because it cannot display negative values. Instead of replacing
our input device with a more sophisticated one allowing us to control both amplitude and phase
[4], we propose to post-process our filters with a Direct Binary Search (DBS)-based method. This
method is general-purpose and we show that it allows us to display negative-valued images.


2.  Basic  Theory 

The basic principle of joint transform correlation [1] is now widely known, and we shall not
present it. The need for distortion invariance is clear: on our correlator, at the CERT Optical
Science Department, an in-plane rotation of one degree or a ±5% scale change causes a dramatic
energy loss of more than 3 dB. To achieve invariance to such distortions, we decided to use the
SDF technique [5], a general method for achieving any invariance. In this paper, we limit
ourselves to in-plane rotations: in this case, the SDF technique consists in replacing the
reference image with a filter built from several rotated views. The correlation peak height will
be equal for all views, provided the input view belongs to the training set. The point is to
test intermediate rotated input views, between those of the training set.

The simplest SDF filter, a linear combination of views from the training set, is called the
pseudo-inverse SDF (PI-SDF). Other types of filters have since been derived from it: the
minimum variance SDF [6] (MV-SDF), which is resistant to coloured noise, the minimum
average correlation energy SDF [7] (MACE-SDF), which sharpens the correlation peak, and the
optimal trade-off SDF [8] (OT-SDF), which combines the previous two qualities.
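
As an illustration of the PI-SDF construction, the sketch below uses the standard equal-correlation-peak formulation, in which the filter h = X (XᵀX)⁻¹ c is the linear combination of training views whose central correlation with each view equals a prescribed value. The random arrays standing in for the rotated views, and the names `pi_sdf`, `views` and `central_peaks`, are assumptions made for the example, not the data used in this paper.

```python
import numpy as np

def pi_sdf(views, peaks=None):
    """Pseudo-inverse SDF: h = X (X^T X)^{-1} c, a linear combination of the views."""
    X = np.stack([v.ravel() for v in views], axis=1)       # one column per training view
    c = np.ones(X.shape[1]) if peaks is None else np.asarray(peaks, dtype=float)
    coeffs = np.linalg.solve(X.T @ X, c)                   # weights of the combination
    return (X @ coeffs).reshape(views[0].shape), coeffs

rng = np.random.default_rng(0)
views = [rng.random((16, 16)) for _ in range(6)]           # stand-ins for 6 rotated views
h, coeffs = pi_sdf(views)
central_peaks = np.array([np.sum(v * h) for v in views])   # correlation at zero shift
print(central_peaks)
```

By construction the central correlation value is the same for every view in the training set; nothing in the construction constrains the response to intermediate views, which is why they have to be tested separately.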


3.  Preliminary  simulations 

We tested the four kinds of SDFs mentioned above, in order to verify their training ability. For
each type of filter, we calculated four filters, each of them containing 6 views of F-4 or F-18
planes, with a different angular step between two consecutive views: 0.5°, 1°, 2° or 4°.

The simulations give the same results as those in the literature for the 4f architecture. The
training ability of the PI-SDF is average: there is a marked difference between images belonging
to the training set and images of the same airplane that are not in the training set. The
discrimination between the plane of the training set and another lifelike plane is unambiguous
(the ratio of the correlation peak heights is over 4). The MACE-SDF only recognizes the views
it contains and so proves useless, and the MV-SDF produces very poor discrimination,
sometimes no discrimination at all. As expected, the OT-SDF produces intermediate results,
and is worth testing experimentally.


4.  Experiments 

Such experiments had already been reported [9], but never with a joint transform correlator.
The experimental setup was implemented on the CERT Optical Science Department
demonstrator. The input SLM of the correlator, a GEC-Marconi liquid crystal light valve
(LCLV), performs amplitude modulation [10].

We tried to test the four kinds of filters previously simulated. Unfortunately, MACE-SDFs,
MV-SDFs and therefore OT-SDFs proved unusable: they are non-positive, so they produce a
non-zero background on our amplitude SLM, and above all their dynamic range is high. The input
setup cannot display both a large dynamic range and tiny details.

Fig. 1. Simulated and experimental correlation peak heights versus rotation angle (°) for
PI-SDFs whose views differ by 2° (a) or 4° (b). Curves: simulated autocorrelation, simulated
cross-correlation, experimental autocorrelation, experimental cross-correlation. In all figures
(1(a), 1(b) and 2), crosses denote autocorrelation when the input view belongs to the training
set and 'xcorr' curves relate to cross-correlations.

We tested 6-view PI-SDFs, with respective angular steps of 0.5°, 1°, 2° and 4° between two
consecutive views. For a 0.5°, 1° or 2° step, the output response is quite uniform over the
angular range: ±7% (see fig. 1(a) for the 2° step). The training ability is good: we cannot
find any difference between the images belonging to the training set and images outside the
training set. The discrimination is also good (the ratio of the peak heights is at least 1:5).
For a 4° step, the results deteriorate dramatically (fig. 1(b)) and are no longer in good
agreement with the simulations.


5. Simulations

To overcome the incorrect display of non-positive images, there are two solutions: to
post-process the images in order to fit the SLM coding domain, or to replace the input system
with a more sophisticated one [4]. We chose the first way, and decided to make the filters fit
our input device as well as possible. Juvells et al. [3] have recently proposed to code the
filter with a detour-phase technique (derived from Lee's method [11] of generating holograms by
computer), which allows the whole filter information to be displayed with only positive values.
Such methods have been extensively studied in the field of holography, and one of the best has
proved to be Direct Binary Search (DBS) [12,13].

This iterative method was originally dedicated to binary holograms; Legeard et al. [14]
have extended it to grey-level holograms. A random image (the "hologram") is first generated,
and its FFT (the "reconstruction") is then computed. An error function is calculated by
comparing this FFT with the FFT of the original image. The grey level of each pixel is then
modified, pixel after pixel, and a new error is computed each time between the new
"reconstruction" and the FFT of the original image. If the new error is smaller than the
previous one, the modification of the pixel is kept; if not, it is cancelled. This
iterative process goes on until no further modifications are accepted.
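
The steps just described can be sketched as follows. This is a minimal illustration, not the implementation of [14] or the one used for our filters: the error is a plain sum of squared FFT differences, every grey level is tried at every pixel, and the optional mask `region` (here used to ignore the zero-frequency term, so that a constant bias is tolerated) is an assumption of the sketch. The test image and all names are hypothetical.

```python
import numpy as np

def dbs_grey(target, levels, region=None, seed=0, max_passes=50):
    """Grey-level direct binary search following the steps described in the text.

    `region` is an optional boolean mask selecting the spatial frequencies that
    enter the error (an assumption of this sketch; default: all frequencies).
    Returns the non-negative image, the initial error and the final error.
    """
    rng = np.random.default_rng(seed)
    grey = np.linspace(0.0, 1.0, levels)               # displayable grey levels
    T = np.fft.fft2(target)
    if region is None:
        region = np.ones(target.shape, dtype=bool)

    def error(img):
        return np.sum(np.abs(np.fft.fft2(img) - T)[region] ** 2)

    holo = rng.choice(grey, size=target.shape)         # random starting "hologram"
    err = first_err = error(holo)
    for _ in range(max_passes):
        changed = False
        for idx in np.ndindex(*holo.shape):            # visit the pixels one by one
            for g in grey:
                old = holo[idx]
                if g == old:
                    continue
                holo[idx] = g
                new_err = error(holo)
                if new_err < err - 1e-12:
                    err, changed = new_err, True       # keep the modification
                else:
                    holo[idx] = old                    # cancel it
        if not changed:                                # a full pass without any change
            break
    return holo, first_err, err

yy, xx = np.indices((8, 8))
target = 0.4 * np.cos(2 * np.pi * xx / 8) * np.cos(2 * np.pi * yy / 4)  # bipolar image
region = np.ones((8, 8), dtype=bool)
region[0, 0] = False                                   # ignore the zero-frequency term
holo, first_err, final_err = dbs_grey(target, levels=16, region=region)
print(first_err, final_err, holo.min(), holo.max())
```

Recomputing the full FFT after each trial keeps the sketch short but is wasteful; this is the main source of the computational cost of the method.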

We propose to code our OT-SDF filters with this method, with a number of grey levels
ranging from 2 to 256. The behaviour of the filters has been simulated, and the results,
normalized with respect to the post-processed filter energy, are depicted in fig. 2. The
difference between views belonging, or not, to the training set is less marked after
post-processing. The discrimination ability is maintained. The simulations also show that a
reduced number of grey levels is sufficient, with hardly any difference (invisible on a graph)
between the 16-level filter and the 256-level one. This allows us to content ourselves with
filters comprising a reduced number of grey levels, which are computationally less expensive.

Fig. 2. Simulated correlation peak heights for OT-SDFs whose views differ by 2°, and which
have been post-processed with DBS. Curves: autocorrelation and cross-correlation for 2, 16 and
256 grey levels; 2, 16 or 256 indicate the number of grey levels in the filter.

This method has a major advantage over a detour-phase one: the space-bandwidth product
required to display the filter is not increased. On the other hand, the computational cost is
high, since the method is iterative. Furthermore, as in the holographic case, we can expect
better results as far as error is concerned [13].


6.  Conclusion 

We have implemented a joint transform correlator using synthetic discriminant function filters.
The experiments proved that the MACE-SDF and the MV-SDF, and therefore the OT-SDF, cannot
be displayed correctly on an amplitude SLM. But the PI-SDF, which is correctly displayed,
works. We showed that for an in-plane rotation, provided the angular step between the views
it contains is not too large, this filter can lead to an unambiguous discrimination for any
object orientation over the filter angular range. It showed a certain training ability and,
above all, could be optically implemented, without too much degradation, in JTC
architectures, compact examples of which have become popular. Further simulations concerning
OT filters showed that, provided the filters were post-processed to fit the SLM constraints (in
our case an amplitude device), they could be implemented in a JTC architecture without losing
their inherent qualities.
Guy Lebreton

GESSY, Université de Toulon, B.P. 132, 83957 La Garde cedex, France


Abstract.  For  invariant  pattern  recognition,  an  original  technique  using  a  simple  associative 
optical  memory  is  presented;  it  detects  the  orientation  of  any  object  before  identification. 
Then  the  implementation  of  another  associative  optical  memory  for  pattern  recognition 
invariant  to  scale  and  projections  is  discussed. 


1.  Introduction 

To increase the computing efficiency at moderate data flow (typically video rate), the
original idea of the hybrid neural network architecture (modified Hopfield type) is to utilise
a non-linear optical amplifier between two cascaded optical correlators, the first one giving
the weights (inner product with central correlations), the second yielding the weighted
memory output for electronic thresholding and feed-back to the optical processor. In the
implementation proposed at OC92 [1], the high-capacity programmable optical memory
was a photothermoplastic plate, and the non-linear amplifier was a BGO crystal with phase
conjugation (Fig. 1). It was shown that a high non-linearity (of order 4 at least) maintains
the dominant influence of the r(0) peak in the extended window required for shift invariance
[2].


Figure  1  :  double-correlator  optical  associative  memory  with  complex  input 
Instead of performing the standard outer product, computing the weighting coefficients as
the sum of central correlation peaks between the input image $f$ and the memory models $f^m$,
the output of the first-stage correlator keeps the correlation window, to provide for shift
invariance. The second-stage correlator convolves it with the memories for the
auto-associative reconstruction; the blur which would result from this low-pass filtering is
compensated for by the highly non-linear detection (power 4 instead of the power 2 used in
ref. 2).

To  obtain  this  high  non-linearity,  the  photorefractive  crystal  should  be  used  twice,  but  this 
requires  too  much  optical  input  power.  The  new  version  proposed  will  replace  the  crystal 
with  a  non-linear  optically  addressed  ferroelectric  liquid  crystal. 

The progress presented here does not concern the non-linear optical layer, but the algorithms
of the process, and should also be of interest for electronic computing. The goal was to
utilise the above double-correlator optical associative memory for two successive operations:
first, find the orientation of an unknown input object and rotate it as a pre-processing stage;
then, identify the object with projection- and scale-invariant detection. The first operation
has been successfully completed and will be fully described. The second operation is now under
investigation and current plans for its implementation will be discussed.

2.  Rotation  pre-processing  before  projection-invariant  recognition. 

The filter previously reported [1] for finding the orientation of an input from a set of 4
aircraft was the one (the left one in Fig. 2) that had a detectable cross-correlation with all
other objects of this small data base. The memory contained replicas of this model, rotated
in steps of ten degrees. The results were satisfactory on contour objects, but were restricted to one
class of correlated objects and required edge detection pre-processing. A simplified
model object was then used, as shown in Fig. 2.

The model oriented closest to the direction of the input aircraft is reconstructed after a few
iterations, but it appears that the simplified wings are not useful. After their suppression, it
was found that the two lines of the remaining angle in the model could induce a conflicting
competition to select the orientation. The final model has been reduced to one thick line
(2x50 pixels) and the edge detection pre-processing of the input was suppressed. Excellent
results were then obtained for any type of input object, quite independent of distortions,
as illustrated in Fig. 3.


Fig. 3  :  detection  of  the  orientation  of  an  unknown  object 

The  selectivity,  tested  with  ellipses  as  input,  reaches  a  1  dot  difference  in  the  discrete  image. 
For  a  direction  exactly  between  two  nearest  recorded  angles  of  the  filter,  one  obtains  a  stable 
output  with  identical  values  for  the  nearest  angles,  if  all  rotated  replicas  have  exactly  the 
same  weight.  To  provide  for  this  condition  with  the  discrete  64x64  image,  a  longer  model 
line  was  used  to  compute  each  rotated  version  and  then  truncated  to  a  uniform  weight  of  100 
pixels. 
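
The equal-weight condition on the rotated replicas can be illustrated numerically. The sketch below is a minimal Python example, assuming a line through the image centre, a half-width of one pixel and replicas every ten degrees over 180 degrees; the function name `line_replica` and these parameters are hypothetical. A line longer than needed is rasterised first and then truncated to exactly 100 pixels by keeping the pixels nearest the centre.

```python
import numpy as np

def line_replica(angle_deg, size=64, half_width=1.0, weight=100):
    """Binary thick line through the centre, truncated to exactly `weight` pixels."""
    c = (size - 1) / 2.0
    y, x = np.indices((size, size), dtype=float)
    t = np.deg2rad(angle_deg)
    along = (x - c) * np.cos(t) + (y - c) * np.sin(t)     # position along the line
    across = -(x - c) * np.sin(t) + (y - c) * np.cos(t)   # distance from the line
    # The "longer model line": every pixel of the strip crossing the whole image.
    candidates = np.flatnonzero(np.abs(across).ravel() <= half_width)
    # Truncate: keep the `weight` strip pixels closest to the centre.
    order = np.argsort(np.abs(along).ravel()[candidates], kind="stable")
    image = np.zeros(size * size)
    image[candidates[order[:weight]]] = 1.0
    return image.reshape(size, size)

replicas = [line_replica(angle) for angle in range(0, 180, 10)]
weights = [int(r.sum()) for r in replicas]
```

Every entry of `weights` is 100, so no orientation is favoured by the discretisation alone.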

An important improvement in convergence speed was obtained by programming an adaptive
threshold, which increases the gain when the competition between the output modes is low. In
a standard Hopfield model, the threshold plane is the correlation plane, which cannot be
accessed here in the double-correlator architecture. One therefore uses Parseval's theorem,
which states that the correlation peak amplitude equals the energy of the signal. Our threshold
criterion is then the sum over the output plane. The adaptive threshold shown in Fig. 3
was adjusted at each iteration so as to keep in the new reconstructed image only 1.1 times the
number of pixels of the model filter, ±10%.
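
The pixel-count criterion above can be sketched as a small bisection on the threshold value. This is a minimal Python illustration, not the program used in the experiments: the function name `adaptive_threshold`, the bisection strategy and the random test image are assumptions; only the target of 1.1 times the model pixel count within ±10% comes from the text.

```python
import numpy as np

def adaptive_threshold(output, n_model, factor=1.1, tol=0.10, max_iter=60):
    """Find a threshold keeping about factor * n_model pixels (within +/- tol)."""
    target = factor * n_model
    low_count, high_count = (1.0 - tol) * target, (1.0 + tol) * target
    lo, hi = 0.0, float(output.max())
    threshold = 0.5 * (lo + hi)
    for _ in range(max_iter):
        threshold = 0.5 * (lo + hi)
        kept = int((output > threshold).sum())
        if kept > high_count:      # too many pixels survive: raise the threshold
            lo = threshold
        elif kept < low_count:     # too few pixels survive: lower the threshold
            hi = threshold
        else:
            break
    return threshold, output > threshold

# Hypothetical output plane and a model of 100 pixels.
rng = np.random.default_rng(1)
plane = rng.random((64, 64))
threshold, mask = adaptive_threshold(plane, n_model=100)
```

Because the threshold follows the pixel count rather than a fixed level, the gain rises automatically when no output mode dominates.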

3.  Projection  and  scale  invariant  pattern  recognition.


After  this  pre-processing,  a  full  invariance  in  pattern  recognition  can  be  achieved  using  the 
same  associative  memory  with  a  scale  and  projection  invariant  filter.  The  interest  of  using 
the  associative  memory  architecture  instead  of  a  simple  correlation  is  to  enhance  the  signal 
to  noise  ratio  and  the  tolerance  to  distortions.  First  a  Log-Log  single  harmonic  filter  (2-D 
Mellin  transform)  was  tested:  it  is  invariant  to  projections  on  both  axes,  but  it  would  require 
a  Mellin  transform  on  the  real-time  input  as  in  previous  implementations  [3].  A  new  filter 
has  been  synthesised  recently,  by  doing  first  a  Mellin-radial  single  harmonic  filter  f^^, 
invariant  to  scale,  and  then  a  1-D  Log  Transform  f^R^  with  the  same  harmonic  N  [4]  : 

(r,e)  =  .|T(r,e). 

=  4^x,y).x‘™dx 

First  experiments  in  correlation  show  an  insufficient  discrimination  capability  for  inserting 
this  filter  in  the  associative  memory.  The  rejection  of  another  object  g  might  be  forced  with 
the  synthetic  discriminant  function  approach  : 

f^^j,(x,y)  =  |xf^’'^“''^[a.fR  (x,y)  +  b.gR(x,y)].  with  complex  numbers  a  and  b 

rr„^^(0,0)  ira]_rii 

solutions  of  '  K  ~  n 

However, experience with the rotation filter shows that a partial correlation from contributing
neighbours in the memory is necessary to enable energy transfers between the intermediate
iterations, so we should allow 1/2 instead of 0 for the cross-correlation. This solution is
under investigation, but it is expected that the b coefficient might be sensitive to distortions
of the input. Another solution currently being tested uses only the Mellin-radial single
harmonic filter for scale invariance, which has good selectivity, and leaves the associative
memory to provide some distortion invariance. However, the Mellin-radial harmonic will
never retrieve hidden information, such as specific top-view, profile and front-view
contents: this information will have to be introduced as a hidden layer in the neural network,
as illustrated for neural network classifiers [5].


4.  Conclusion. 

Previously,  the  memory  content  for  orientation  detection  had  to  be  selective  for  one  class  of 
object,  with  limited  tolerance  to  distortions,  and  worked  only  on  contours  with  low  optical 
efficiency.  Now,  excellent  results  have  been  obtained  for  any  type  of  filled  objects,  using  as 
memory  a  simple  line  with  rotated  replicas. 

For  the  projection  invariance,  much  work  is  still  to  be  done,  but  the  harmonic  decomposition 
approach  is  expected  to  avoid  much  of  the  redundancy  of  a  synthetic  discriminant  filter  by 
providing  at  least  for  scale  invariance. 

Abstract. Mutually orthogonal pattern distortions are handled by an adaptive
optical recognition system. A double channel system is presented which
implements pattern recognition with rotation, scale and shift invariance. The
overall recognition process is performed efficiently and can be executed in real
time.


1.  Introduction 

The recognition process is based on a two-stage operation: an object-independent
determination of one distortion parameter (the scale, in the example presented here),
followed by recognition with a shift and rotation invariant optical correlator
which is adapted to the measured parameter (in our example the input is scaled up or
down, according to the measured scale). Thus, complete invariance to three distortion
parameters is achieved by the combination of two channels. See also Fig. 1.


2.  Scale  measurement  procedure 

The  overall  recognition  process  is  performed  in  two  stages  where  the  scale  measurement 
is  performed  first.  This  procedure  dictates  several  properties:  The  scale  measurement 
must  be  rotation  and  shift  invariant  since  the  object’s  orientation  and  position  are  not 
known  a  priori.  It  must  be  independent  of  the  object  as  long  as  the  object  belongs  to  the 
class  of  objects  to  be  recognized.  The  whole  process  must  be  easily  implemented  optically 
and,  finally,  it  must  be  fast  enough  for  efficient  operation  of  the  complete  system. 

Let $f(r,\theta)$ be the input function and $F(\rho,\phi)$ its Fourier transform (in polar
co-ordinates). When the scale of the input is extended by a factor $a$, the energy, using
Parseval's theorem, becomes:

$$E_a = \int_0^{2\pi}\!\!\int_0^{\infty} \left| a^2 F(a\rho,\phi) \right|^2 \rho \, d\rho \, d\phi = a^2 E_0 , \qquad (1)$$

Figure  1.  Schematic  configuration  of  the  overall  recognition  system.


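where $E_0$ is the energy of the original function $f(r,\theta)$. Let $E_a(\rho)$ be the energy of the
object in the radial spatial frequency interval $0 < \rho' < \rho$:

$$E_a(\rho) = \int_0^{2\pi}\!\!\int_0^{\rho} \left| a^2 F(a\rho',\phi) \right|^2 \rho' \, d\rho' \, d\phi = a^2 \int_0^{2\pi}\!\!\int_0^{a\rho} \left| F(\rho',\phi) \right|^2 \rho' \, d\rho' \, d\phi \qquad (2)$$

We define the ratio between $E_a(\rho)$ and the total energy $E_a$ by:

$$\Gamma(a,\rho) = \frac{E_a(\rho)}{E_a} = \frac{\int_0^{2\pi}\!\int_0^{a\rho} \left| F(\rho',\phi) \right|^2 \rho' \, d\rho' \, d\phi}{\int_0^{2\pi}\!\int_0^{\infty} \left| F(\rho',\phi) \right|^2 \rho' \, d\rho' \, d\phi} \qquad (3)$$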

Clearly, $\Gamma(a,\rho)$ depends on $a$ and $\rho$ only through the integration limit $a\rho$. Hence, by requiring the
value of $\Gamma$ to assume a fixed (reference) value $\Gamma_0$ independent of the scale, the scale
factor, $a$, can now be easily extracted from the simple relation:

$$a = \frac{a_0 \, \rho_0}{\rho} \qquad (4)$$

where $a_0$ is the reference scale and $\rho_0$ the corresponding radial frequency.

One of the most important parameters we must take into consideration is the
operation speed. Therefore, in the laboratory system, we used an approach based on the
binary search algorithm, see [1]. In this way, for an input object of N x N pixels the
scale is found after no more than log_2(N) steps.
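
The scale measurement can be mimicked numerically. The Python sketch below is an illustration under stated assumptions, not the laboratory procedure of [1]: the cumulative energy fraction is computed from a discrete FFT, the reference value 0.5 and the Gaussian test objects are arbitrary choices, and the names `radial_energy_fraction`, `find_radius` and `gaussian_blob` are hypothetical. In the optical system each search step would be one energy measurement through an aperture; here the whole curve is precomputed for brevity.

```python
import numpy as np

def radial_energy_fraction(image):
    """Fraction of spectral energy inside radius k, for k = 0 .. N/2."""
    n = image.shape[0]
    power = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    y, x = np.indices(power.shape)
    r = np.hypot(x - n // 2, y - n // 2)
    cumulative = np.array([power[r <= k].sum() for k in range(n // 2 + 1)])
    return cumulative / power.sum()

def find_radius(fraction, gamma0):
    """Binary search for the smallest radius whose energy fraction reaches gamma0."""
    lo, hi, steps = 0, len(fraction) - 1, 0
    while lo < hi:
        mid = (lo + hi) // 2
        if fraction[mid] >= gamma0:
            hi = mid
        else:
            lo = mid + 1
        steps += 1
    return lo, steps

def gaussian_blob(sigma, n=64):
    y, x = np.indices((n, n)) - n // 2
    return np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))

gamma0, a0 = 0.5, 1.0                      # assumed reference fraction and scale
rho0, _ = find_radius(radial_energy_fraction(gaussian_blob(1.0)), gamma0)
rho, steps = find_radius(radial_energy_fraction(gaussian_blob(3.0)), gamma0)
a_estimate = a0 * rho0 / rho               # scale from a * rho = a0 * rho0
```

The estimate is limited by the integer radius grid, which is one reason a correlation filter tolerant to a small scale mismatch is useful downstream.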


3.  Filter  Design 

We  use  a  two-stage  recognition  process  in  which  the  input  object  is  adapted  (re-scaled) 
before  the  correlation  is  made.  Thus,  theoretically,  in  the  correlation  channel  we  have 
to  use  a  rotation  and  shift  invariant  filter.  However,  due  to  the  errors  in  the  scale 
measurement  procedure,  there  is  a  scale  mismatch  between  the  filter  and  the  object. 
Conventional  filters  (and  especially  rotation  and  shift  invariant  filters),  cannot  operate 
properly  with  such  a  scale  mismatch.  This  is  the  main  reason  for  developing  a  new  filter 
which  is  invariant  to  a  limited  scale  range  as  well  as  remaining  fully  rotation  and  shift 
invariant.  The  scale-invariance  range  was  chosen  according  to  the  scale  measurement 
errors,  so  that  those  errors  are  fully  compensated  by  the  correlation  channel. 

Figure 2. The POPOCSF sensitivity to scale in comparison with that of the
POCHF. The intensities are normalized so that both peaks are equal.


The  filter  was  designed  using  the  POCS  (Projection  Onto  Convex  Sets)  algorithm 
and  will  be  called  the  POCS  filter  (POCSF).  The  requirements  that  are  explicitly  used 
in  the  design  procedure  are:  high  correlation  peak  intensity,  good  discrimination  ability 
(in  this  case  detect  the  letter  ’F’  while  rejecting  the  rest),  full  rotation  invariance,  the 
filter  being  a  passive  element,  and  finally,  limited  scale  invariance.  This  algorithm  and 
its  implementation  for  our  purposes  are  discussed  in  detail  in  [2]  and  [1],  respectively, 
and  hence  will  not  be  repeated  here. 

The  laboratory  implementation  of  the  optical  correlator  employs  SLM’s,  both  in 
the  input  and  in  the  Fourier  planes.  This  architecture  dictates  several  requirements  from 
the  filter,  including  high  light  efficiency,  high  correlation  peak  and  the  ability  to  record 
the  filter  on  the  SLM  which  has  a  limited  number  of  pixels.  A  phase-only  version  of  the 
designed  filter  (POPOCSF)  is  a  good  candidate  for  implementation.  The  invariant  scale 
range  ([ai,a2])  was  chosen  to  be  ±8%. This  range  fully  compensates  for  the  inaccuracies 
of  the  scale  estimation. 

Fig. 2 clearly demonstrates that the POPOCSF is indeed invariant over the scale range
required by the limited accuracy of the scale measurement, and even for a
scale range of ±10% the peak remains above 95% of its maximum value. The phase-only
CHF (POCHF), as expected, is very sensitive to scale and, obviously, cannot be used
in the proposed system. The laboratory correlation system was based on a 4-f optical
correlator in which both filter and input were written on SLM's. Several experiments
were performed in order to investigate the filter characteristics, including discrimination
and sensitivity towards rotation and scale changes of the input function. As can be
seen from Fig. 3, the letter 'F' was recognized while the letter 'P' was rejected with a
rejection ratio of approximately 50%.

The  experimental  performance  of  the  complete  system  was  tested  and  excellent 
results  were  obtained,  as  expected  from  the  results  presented  in  previous  sections. 


4.  Conclusions 

A double channel optical pattern recognition system was introduced, implementing
distortion invariant operation. The basic principle is quite general and can be extended to
more than two channels implementing measurements on more than two distortion
parameters. To keep this work within reasonable limits, the discussion here was restricted
to the problem of three parameter distortion invariant pattern recognition (scale, shift
and rotation). Laboratory experiments using alphanumeric characters demonstrated
excellent performance in agreement with computer simulations.

Figure 3. Correlation response of the binary POPOCSF for the letters 'F'
and 'P'. (a) The output plane, (b) Cross section of the correlation peaks.

Because the overall recognition process is performed in two stages, the main
consideration is the operation speed of every channel and the time it takes to adapt the
correlator. The advantages and efficiency of the two-stage procedure are judged primarily
by the speed criterion. Special emphasis was put on this subject by using real-time
components such as SLM's and CCD's, both in the scale measurement channel and in
the correlation channel. This led to a very rapid scale measurement procedure (see
section 2), while the correlation channel, using a static a priori designed filter, produces
the correlation results at the speed of light. Thus, practically, the correlator may be used
in real time.

The real time adaptive correlator enables us to use the system for many related
applications, such as iterative filter design, pattern classification etc.

Acknowledgement: This work was performed within the Technion Advanced
Opto-Electronics Center established by the American Technion Society (ATS), New York.

Application of optical multiple-correlation to recognition
of road signs: the ability of multiple-correlation

Abstract. We apply correlation filters to the problem of locating multiple objects in a real
scene and discuss the ability of optical correlators. First, we design a correlation filter to
reduce false signals caused by background and noise in an input scene. Then we introduce
color processing for locating multiple road signs. The performance of these filtering methods is
verified by computer simulations.


1.  Introduction 

Optical correlators are one of the most powerful techniques for locating multiple objects, and they
are expected to serve as real-time processors for real world scenes. Processing a real world scene
requires robustness against object distortions, noise, and the background of the input scene.
Robustness against background is especially important because many unknown images
will appear in the background of real scenes. These unknown images are too numerous to be used as
training images in the filter design process.

Many correlation filters have been developed to achieve distortion invariant
processing (e.g. [1]-[4]). However, they are mainly designed to achieve invariance to
object shape distortion, and robustness against the background has not been investigated well.

In this paper, we apply multiple correlation filters to the problem of locating multiple
objects, here road signs, in a real scene and discuss their locating ability and
robustness.


2.  Design  of  robust  filter  for  unknown  background 

The final locating result is obtained by thresholding the correlation pattern of an input image
and a correlation filter. Therefore, the true signals must be discriminated from the false
signals caused by noise, background, and sidelobes around a correlation peak by means of an
appropriate threshold level. Some correlation filters have been proposed (e.g. [1]-[4]) which
can suppress the sidelobes or reduce the ill effects of noise.

To obtain an optimum filter that improves the discrimination, we examined the ability of
well-known filters to reduce the false signals. We designed 5 filters, namely minimum noise and
correlation energy (MINACE) filters [3] with different noise variances, from the training
images in Fig. 1, and evaluated their performance by applying them to a real scene. The
designed MINACE filters are intended to detect the #0 and #1 patterns of Fig. 1.

Table 1 shows the characteristics of the filters: the noise variance used in the filter
synthesis in the first column, the energy of the correlation filter (the sum of the squares of the
pixel values of the filter) in the second column, the average ratios of the peak intensity to the
sidelobe intensity (PSR) for the true signal and for the false signal in the third and fourth
columns, respectively, and the maximum pixel values inside the peak area and outside the peak
area in the fifth to eighth columns. The peak area in this case is the central 10x10 pixel region.

Fig. 1  Training set images for the evaluation of correlation filter.

Table 1.  Evaluation result of MINACE filter by changing noise variance

                        PSR             maximum value       maximum value
                                        for true pattern    for false pattern
filter       energy   true    false    inside   outside    inside   outside
SDF filter   10.86    0.45    0.15     1.0      0.10       0.31     0.39
σ²=50        10.90    0.57    0.17     1.0      0.09       0.29     0.35
σ²=10        11.44    0.96    0.21     1.0      0.06       0.19     0.20
σ²=1         15.72    0.96    0.16     1.0      0.02       0.06     0.07
σ²=0         34.23    0.94    0.08     1.0      0.02       0.03     0.03

A filter with lower filter energy reduces the ill effects of noise more effectively, and a filter
with a larger PSR for the true pattern generates smaller sidelobes. However, none of the designed
filters has both a low filter energy and a large PSR.

Figure 2 shows the correlation results obtained by applying the designed MINACE filters to
the test scene in Fig. 2 (a). The brightness of these results is reversed. Each threshold level for
the discrimination was set to the largest value at which two
objects in the test scene can still be detected. In this experiment, Fig. 2 (d) shows the best
result in reducing false signals, while the other results have many false signals, as shown in Fig. 2 (b),
(c), (e) and (f), due to the large filter energy and the poor PSR.

Fig. 2  Results of MINACE filtering and SDF filtering on the test scene (a).

These results indicate that a correlation filter can be designed to reduce false signals
caused by background, and that a filter with a small filter energy and a large PSR for the true pattern is
preferable. It seems difficult to eliminate all false signals by using a single correlation filter.


3.  Color  image  processing  for  locating  road  signs 

Single correlation filtering as described in the previous section cannot eliminate false signals
perfectly. To reduce the false signals, we introduce multiple correlation to perform color
image processing, as shown in Fig. 3.

In this processing, an input image is separated into color component images (red, green,
blue and gray components), and correlation filters are applied to the individual color
components. The final result is obtained by applying an AND operation to all component results;
i.e. locating signals appear only when the objects are detected at the same position in all
component images. To improve the scale invariance, each component is processed
by multiple correlation, in which objects of different sizes can be detected by different filters.
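
The combination rule described above (an OR over the size-specific filters within one color component, followed by an AND over the components) can be sketched in Python as follows. This is a toy illustration only: the square templates, the threshold relative to each template's autocorrelation peak and the names `correlate`, `detect` and `locate` are assumptions, not the filters or thresholds used in the simulations.

```python
import numpy as np

def correlate(image, template):
    """Circular cross-correlation via FFT; the template is zero-padded to the image size."""
    spectrum = np.fft.fft2(image) * np.conj(np.fft.fft2(template, s=image.shape))
    return np.real(np.fft.ifft2(spectrum))

def detect(component, templates, level=0.9):
    """Multiple correlation on one component: a hit from any size-specific filter counts."""
    hits = np.zeros(component.shape, dtype=bool)
    for template in templates:
        # Assumed threshold: a fraction of the template's own autocorrelation peak.
        hits |= correlate(component, template) >= level * np.sum(template ** 2)
    return hits

def locate(components, templates, level=0.9):
    """AND of the per-component detections."""
    result = np.ones(components[0].shape, dtype=bool)
    for component in components:
        result &= detect(component, templates, level)
    return result

# Toy scene: one 4x4 "sign" present in both components, plus clutter in each.
red = np.zeros((32, 32))
gray = np.zeros((32, 32))
red[10:14, 10:14] = 1.0
gray[10:14, 10:14] = 1.0
red[20:24, 20:24] = 1.0    # clutter visible only in the red component
gray[25:29, 5:9] = 1.0     # clutter visible only in the gray component
templates = [np.ones((4, 4)), np.ones((6, 6))]
located = locate([red, gray], templates)
```

Each component alone reports two candidate positions, while the AND keeps only the position common to both, which is the mechanism by which false signals are suppressed.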

We applied this color processing to the problem of locating red road signs in a real
scene. In this application, we used only two color components, i.e. the gray component and the red
component, because all objects have mainly gray and red components. Each correlation filter
is synthesized from 33 training images, and 9 types of correlation filters are provided for
detecting objects of various sizes.

Figure 4 shows an example of the correlation peak values obtained with the 9 correlation
filters and objects of various sizes. The horizontal axis denotes the object size and the
vertical axis denotes the correlation peak value. Fig. 4 demonstrates that objects of almost all
sizes can be detected by one of these filters with an appropriate threshold level.

Fig. 3  Procedure for color image processing.

Fig. 4  Correlation peak values obtained by the SDF filters (f10, f15, f50) and different size objects.

Figure 5 shows the computer simulation results obtained by applying this color processing to
three different test scenes shown in Fig. 5 (a), (b) and (c), which contain three objects,
four objects, and two objects, respectively. These test scenes have different image complexity.
Figure 5 (d), (e) and (f) show the processed results for test scenes (a), (b) and (c),
respectively. A perfect result is obtained in Fig. 5 (d), i.e. three objects are located
without false signals, whereas single correlation filtering could not eliminate the false signals, as
described in the previous section. In the other results, Fig. 5 (e) and (f), only a few false signals
appear, although perfect locating is not achieved. These results indicate the great ability of
multiple correlation to achieve robust processing against the background.

Fig. 5  Results by applying color image processing to the three test scenes: (d) color processing on (a), (e) color processing on (b), (f) color processing on (c).


4.  Conclusion 

To investigate the ability of optical correlators, we applied multiple correlation to the
problem of locating multiple road signs in a real scene. By computer simulations, we
showed that a well-designed correlation filter can reduce false signals caused by the background
of an input image, but that it is difficult to eliminate all false signals with a single correlation filter.
We then introduced multiple correlation to perform color image processing. The
computer simulation results of the color processing indicated that multiple correlation has a great
ability to suppress false signals compared with single correlation. However, this color
processing cannot yet locate road signs in a real scene perfectly. More robust
processing will be needed for real world scene recognition.

Abstract. We propose a new algorithm for the design of multiple-object discriminant
correlation filters. This technique makes it possible to control the characteristics of the filter.
By considering the properties of recording media in the filter synthesis, improvements in filter
design are possible. Results of computer simulations are demonstrated.


1.  Introduction 

The technique of using matched spatial filters for pattern classification or pattern recognition
has been well studied. Many methods have been proposed to use the filters for detection of
patterns in the presence of noise and distortion, or for multiple-object classification (e.g., [1]).
However, as the functions of the filter increase, more information has to be recorded
on it. Therefore, if the recording medium has insufficient gray levels, the filter cannot be recorded
correctly. Because of this inaccuracy, the expected correlation response is not obtained.

We  studied  the  case  of  recording  correlation  filters  on  discrete  type  recording  media, 
such  as  discrete  detour  phase  computer-generated  holograms  (CGHs).  In  this  type  of  CGHs, 
the  Fourier  coefficients  of  filter  function  are  encoded  discretely.  Therefore,  the  amplitude  and 
phase  components  for  each  pixel  of  the  filter  are  quantized.  We  studied  the  effects  of 
quantization  in  the  correlation  peak  value  for  the  cases  of  two  popular  types  of  multiple-object 
discriminant  filter,  namely,  synthetic  discriminant  function  (SDF)  filter  [1]  and  minimum 
average  correlation  energy  (MACE)  filter  [2].  It  was  shown  that  the  effects  are  not  negligible. 

In this paper, we propose a new type of spatial filter for pattern classification. The
proposed filter is calculated by using a simulated annealing (SA) algorithm [3] that takes
the properties of the recording medium into account. The filter is convenient to implement on a
recording medium which has few gray levels. In addition, the characteristics of the filter, such
as correlation peak value at the origin, sidelobe level and sensitivity to noise and distortion,
can be controlled parametrically by the method. Results of computer simulations showed that
the filter had the expected correlation response.


2.  Effects  of  quantization  on  correlation 

In this section we show the effects of quantization on the correlation results. We studied the
case of recording filters as discrete detour phase CGHs, which are a modified version of
Lohmann type CGHs [4] and are suitable for recording. The CGHs consist of small regions
or cells corresponding to each Fourier coefficient. Each cell consists of more than one subcell.
In each cell, the amplitude component of the coefficient is controlled by the number of open
subcells and the phase component is determined by the positions of the open subcells. If each cell
consists of 4 x 4 subcells, the amplitude and phase components are quantized to 5 and 4 levels,
respectively.

Fig. 1. Training patterns.

Fig. 2. Test patterns (a1), (a2), (c1), (e1).

Table 1. Correlation responses of the filters

input pattern   SDF    MACE   qSDF   qMACE
a1              1.00   1.00   1.19   0.70
a2              1.00   1.00   1.08   0.73
c1              0.00   0.00   0.20   0.01
e1              0.00   0.00   0.11   0.00

To estimate the effects of quantization on the correlation results, we calculated the SDF
filter and the MACE filter. Figure 1 shows the training patterns used for filter synthesis. Each
pattern was decimated to 32 x 32 pixel resolution. The filters were designed to distinguish the
character "a" from the others. Therefore, we specified correlation peak values of 1.0 and
0.0 for patterns of "a" and the others, respectively. We also calculated the quantized versions
of the filters, encoded with 5 amplitude and 4 phase levels. These filters shall be called qSDF
and qMACE, respectively. Figure 2 shows the test patterns used for the estimation. The
patterns are included in the training patterns. Table 1 lists the values of intensity at the origin
of the correlation plane. The correlation values deviate by 20 percent (SDF) or 30 percent
(MACE) from the specified values. These effects detract from the advantage of the SDF or
MACE filter that the correlation values are specifiable.
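
The quantization step itself is easy to reproduce numerically. The following Python sketch rounds a complex filter to 5 amplitude and 4 phase levels; it is an illustration under assumptions: uniform amplitude levels scaled to the largest coefficient, phase levels spaced by 90 degrees, a random complex array standing in for a real filter, and a hypothetical function name `quantize_filter`. The actual detour phase encoding may assign levels differently.

```python
import numpy as np

def quantize_filter(F, n_amp=5, n_phase=4):
    """Round each complex coefficient to n_amp amplitude and n_phase phase levels."""
    amplitude = np.abs(F)
    peak = amplitude.max()
    # Uniform amplitude levels 0, 1/(n_amp-1), ..., 1 (times the peak value).
    amp_q = np.round(amplitude / peak * (n_amp - 1)) / (n_amp - 1) * peak
    # Uniform phase levels separated by 2*pi / n_phase.
    step = 2.0 * np.pi / n_phase
    phase_q = np.round(np.angle(F) / step) * step
    return amp_q * np.exp(1j * phase_q)

# Stand-in for a filter: random complex coefficients (not an SDF or MACE filter).
rng = np.random.default_rng(0)
F = rng.normal(size=(32, 32)) + 1j * rng.normal(size=(32, 32))
Fq = quantize_filter(F)

# Effect on the correlation value at the origin for an arbitrary spectrum T.
T = rng.normal(size=(32, 32)) + 1j * rng.normal(size=(32, 32))
exact = np.sum(np.conj(T) * F)
quantized = np.sum(np.conj(T) * Fq)
```

Comparing `exact` with `quantized` shows how the value at the origin of the correlation plane shifts once the filter is restricted to the coarse grid of levels.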


3.  Filter  synthesis 


We applied a simulated annealing (SA) algorithm to the construction of correlation filters
recorded as discrete detour phase CGHs. When the algorithm is applied to an optimization
problem, an energy function to be minimized has to be determined. In this paper, we
introduce three target values which measure the characteristics of the filter, namely, the
correlation peak value at the origin, the sidelobe level and the sensitivity to noise and distortion.
We then use the weighted sum of the three target values as the energy function.

The correlation peak values are evaluated by the difference between the actual correlation
value and the specified one. The target value $E_1$ is defined as follows:

$$E_1 = \sum_{i=1}^{N} \left| \iint T_i^*(X,Y) \, F(X,Y) \, dX \, dY - c_i \right| \qquad (1)$$

where $N$ is the number of training patterns, $T_i(X,Y)$ ($i = 1, 2, \ldots, N$) is the Fourier transform
of the $i$-th training pattern, $F(X,Y)$ is the filter function to be obtained, $c_i$ is the specified correlation
value for the $i$-th training pattern and $*$ denotes the complex conjugate.

The sidelobe level is estimated by the energy of the correlation plane. The target value $E_2$ is given
by

$$E_2 = \sum_{i=1}^{N} \iint \left| T_i^*(X,Y) \, F(X,Y) \right|^2 dX \, dY \qquad (2)$$

The sensitivity to noise and distortion depends on the transmissivity of the filter in the high
frequency area. Therefore, we define the target value $E_3$ as

$$E_3 = \iint \left| F(X,Y) \right|^2 dX \, dY \qquad (3)$$

Therefore, the energy function $E$ to be minimized is defined as

$$E = \lambda_1 E_1 + \lambda_2 E_2 + \lambda_3 E_3 \qquad (4)$$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are positive constants. The filter function which minimizes the energy
function is obtained by the SA algorithm. The characteristics of the filter depend on the
parameters. In other words, the properties of the filter can be controlled parametrically.
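
A compact numerical sketch of this optimization is given below in Python. It is not the authors' program: the discrete sums replacing the integrals, the use of an absolute deviation in the first target value, the 8 x 8 random training patterns, the weights, the cooling schedule and the names `energy` and `anneal` are all assumptions made for illustration. Each filter coefficient is restricted to 5 amplitude and 4 phase levels, and one coefficient is changed per step with Metropolis acceptance.

```python
import numpy as np

AMPS = np.linspace(0.0, 1.0, 5)                      # 5 amplitude levels
PHASES = np.exp(1j * np.pi / 2 * np.arange(4))       # 4 phase levels

def energy(F, T, c, lam):
    """Weighted sum of the three target values (discrete form, assumed)."""
    corr = np.array([np.sum(np.conj(Ti) * F) for Ti in T])
    e1 = np.sum(np.abs(corr - c))                    # deviation of peak values
    e2 = sum(np.sum(np.abs(np.conj(Ti) * F) ** 2) for Ti in T)   # plane energy
    e3 = np.sum(np.abs(F) ** 2)                      # filter transmissivity
    return lam[0] * e1 + lam[1] * e2 + lam[2] * e3

def anneal(T, c, lam, steps=2000, t0=1.0, cooling=0.998, seed=0):
    rng = np.random.default_rng(seed)
    shape = T[0].shape
    F = AMPS[rng.integers(0, 5, shape)] * PHASES[rng.integers(0, 4, shape)]
    e = initial_e = energy(F, T, c, lam)
    best_F, best_e, temp = F.copy(), e, t0
    for _ in range(steps):
        i, j = rng.integers(0, shape[0]), rng.integers(0, shape[1])
        old = F[i, j]
        F[i, j] = AMPS[rng.integers(0, 5)] * PHASES[rng.integers(0, 4)]
        e_new = energy(F, T, c, lam)
        # Metropolis rule: always accept improvements, sometimes accept worse moves.
        if e_new <= e or rng.random() < np.exp((e - e_new) / temp):
            e = e_new
            if e < best_e:
                best_F, best_e = F.copy(), e
        else:
            F[i, j] = old
        temp *= cooling
    return best_F, best_e, initial_e

rng = np.random.default_rng(1)
patterns = [(rng.random((8, 8)) < 0.5).astype(float) for _ in range(2)]
T = [np.fft.fft2(p) / p.size for p in patterns]      # normalised spectra
c = np.array([1.0, 0.0])                             # accept first, reject second
lam = (1.0, 0.1, 0.01)
best_F, best_e, initial_e = anneal(T, c, lam)
```

Changing the entries of `lam` shifts the balance between peak accuracy, sidelobe energy and filter transmissivity, which is the parametric control referred to in the text.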


4.  Results  of  computer  simulations 

We designed many filters by the SA algorithm using different parameters. Figure 3 shows typical correlation planes for (a) the qSDF filter, (b) the qMACE filter, (c) a filter designed by SA with parameters (λ1, λ2 > 0, λ3 = 0), (d) same as (c) with parameters (λ1, λ3 > 0, λ2 = 0), (e) same as (c) with parameters (λ1, λ2, λ3 > 0). These filters designed by SA shall be called SA1, SA2 and SA3, respectively. The input patterns are those shown in Fig. 2.

We also tested the correlation response for other patterns which are not included in the training patterns. The patterns used for the test are shown in Fig. 4. All of the patterns are distorted or noisy versions of the pattern (a1) of Fig. 2. Tables 2 and 3 list the correlation peak values at the origin and the values of the maximum sidelobe for the five filters, respectively. The results for the filters designed by SA show that the sidelobe level and the sensitivity to noise or distortion vary depending on the parameters. Filter SA1 shows sharp correlation peaks, filter SA2 is tolerant of degraded patterns and filter SA3 has characteristics between those of SA1 and SA2.


Fig. 3 Typical correlation planes for (a) qSDF, (b) qMACE, (c) SA1, (d) SA2, (e) SA3.

Fig. 4. Test patterns not included in the training patterns.


Table  2.  Correlation  peak  values  for  patterns  not  included  in  the  training  patterns 


input pattern   qSDF   qMACE   SA1    SA2    SA3
(a)             0.64   0.23    0.02   1.14   0.90
(b)             0.47   0.20    0.01   0.70   0.41
(c)             0.26   0.11    0.18   0.44   0.40
(d)             0.79   0.41    0.66   0.86   0.89

Table 3. Maximum sidelobe for patterns not included in the training patterns

input pattern   qSDF   qMACE   SA1    SA2    SA3
(a)             0.21   0.10    0.06   0.64   0.21
(b)             0.19   0.21    0.08   0.55   0.32
(c)             0.15   0.12    0.07   0.31   0.41
(d)             0.15   0.21    0.06   0.41   0.31

Sidelobe is defined as the intensity in the correlation plane excluding a 5 x 5 region around the origin.


5.  Summary 

We  have  proposed  a  new  technique  for  the  design  of  multiple-object  discriminant  correlation 
filters.  In  the  filter  synthesis,  improvements  in  filter  design  are  possible  by  considering  the 
properties  of  recording  media.  In  addition,  the  characteristics  of  the  filters  can  be  controlled 
parametrically.


References 

[1] Casasent D 1984 Appl. Opt. 23 1620-1627

[2] Mahalanobis A, Vijaya Kumar B V K and Casasent D 1987 Appl. Opt. 26 3633-3640

[3] Aarts E and Korst J 1988 Simulated Annealing and Boltzmann Machines (John Wiley & Sons)

[4] Brown B R and Lohmann A W 1966 Appl. Opt. 5 967-969


Inst. Phys. Conf. Ser. No 139: Part III

Paper presented at Opt. Comput. Int. Conf., Edinburgh, 22-25 August 1994
© 1995 IOP Publishing Ltd


An  Analog  Retina  for  Edge  Detection 


Chunyan  WANG  and  Francis  DEVOS 

Institut  d'Electronique  Fondamentale,  Bat  220,  University  Paris  Sud 
91405  Orsay  Cedex  FRANCE 
Tel.  33-1-69416574 

Abstract  We  present  an  optoelectronic  analog  test  retina  fabricated  in  a  standard  CMOS 
process.  The  retina  is  to  be  reproduced  into  a  pixel  array.  In  each  pixel  there  is  a  photodiode  and 
an  elementary  operator  consisting  of  21  transistors.  As  the  operator  is  a  "programmable"  analog 
cell,  it  is  possible  to  recombine  the  processing  sequences  in  the  circuit  in  order  to  diversify  its 
functions  and  to  achieve  a  spatial  and/or  temporal  differentiation.  By  testing  the  prototype,  the 
performance  for  the  acquisition  and  edge  detection  of  the  circuit  is  evaluated. 


1.  Introduction 

The retina is part of a system for classifying road signs by optical correlation [1]. The system consists of two matrices, each of which has 64 × 128 cells. Controlled by an FLC micro-mirror device [2], the first matrix sequentially displays the images from an "edge library". The second one performs four functions: acquiring the signal of an image carried by the incident light; detecting the edges of the image; displaying the edge image; and eventually outputting the processed data. The circuit presented here is an elementary test retina built for the evaluation of the second matrix.


2.  Analysis  of  the  problem 

Our goal is to integrate a matrix of 8192 cells in a reasonable area (≈ 0.5 cm²). Therefore, the functions of a cell should be simple enough to be achieved with fewer than thirty transistors. To fulfil this task within these limitations we have to choose:

1. An implementation via an analog circuit, as a digital one is out of the question.

2. A compromise between the simplicity of the circuit and the processing time.

The Sobel algorithm is one of the most common methods for differential edge detection, but we did not choose it because the large number of interconnections it requires (due to its 8-pixel neighbourhood) is not feasible. Our detection method originates from that of Roberts [3], whose convolution involves only a four-pixel neighbourhood. We made the following two modifications to the Roberts method:

1/ To get a detection efficiency comparable to that of Sobel, we modified Roberts by introducing two supplementary differentiators into the template impulse response array (Fig.1). The Roberts, modified Roberts and Sobel algorithms were compared using Khoros simulations. The results show that the modified Roberts method offers a compromise between performance and complexity (Fig.2).

Fig.1 Hr1, Hr2: classic Roberts differentiators; Hs1, Hs2: supplementary differentiators

Fig.2 (a) Original image; (b) edge map of Sobel; (c) edge map of Roberts; (d) edge map obtained by using the four Roberts operators. Some diagonal lines missing in Fig.2-c show up here.


Fig.3 General procedure for Roberts detection. F(m,n): original image; Hi(m,n): spatial differentiator; Gi(m,n): gradient in the i-th direction; G(m,n): gradient after the compass; E(m,n): edge map.


2/ The other modification is to the procedure shown in Fig.3, made in order to reduce the variety and the number of physical operators to be implemented. If the variation of the incident light between two samples is negligible, it is possible to simplify the circuit by extending the processing sequence. In short, two of the calculations in the procedure shown in Fig.3 can be eliminated by means of two replacements:

1/ eliminating "absolute value": Gij = |Fi - Fj| = Max{(Fi - Fj), (Fj - Fi)}

2/ eliminating "max value": E = Sign(Gmax - thr.) = OR{Sign(Gij - thr.), over all differentiators}

[ y = sign(x) = 1 if x > 0; otherwise, y = 0 ]

Thus, the general procedure is transformed into iterated simple sub-procedures (Fig.4-b) which are composed of only three operations: accumulation/subtraction, comparison and memorisation of logic "1" (OR in terms of time).

We  can  make  such  an  arrangement  to  complete  a  calculation  for  each  pixel: 

1/  The  execution  of  the  sub-procedure  can  be  iterated  successively  with  the  eight 
differentiators  (Fig.4-a  )  which  choose  the  combination  of  the  input  data. 

2/  Each  execution  can  be  started  by  the  signal  acquisition.  The  elimination  of  the  input 
memory  is  performed  at  the  cost  of  the  iterative  signal  acquisitions. 
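The two replacements and the iterated sub-procedure can be checked numerically with a short Python sketch. The two diagonal pairs are the classic Roberts differences; the horizontal and vertical pairs are only an assumed form of the supplementary differentiators, since Fig.1 is not reproduced here, and the function names are illustrative.

```python
import numpy as np

# Each differentiator is (pixel added, pixel subtracted) as (row, col) offsets
# inside a 2x2 neighbourhood.
PAIRS = [((0, 0), (1, 1)), ((0, 1), (1, 0)),   # classic Roberts diagonals
         ((0, 0), (0, 1)), ((0, 0), (1, 0))]   # ASSUMED supplementary pairs
# Eight signed differentiators: each pair taken in both orders (replaces "absolute value").
DIFFERENTIATORS = PAIRS + [(b, a) for a, b in PAIRS]

def _window(img, offset):
    """View of img shifted by offset, cropped to the (h-1) x (w-1) operator grid."""
    h, w = img.shape
    dy, dx = offset
    return img[dy:dy + h - 1, dx:dx + w - 1]

def edge_map_sequential(image, thr):
    """Iterate subtraction, comparison and OR-memorisation over the eight differentiators."""
    img = np.asarray(image, dtype=float)
    edge = np.zeros((img.shape[0] - 1, img.shape[1] - 1), dtype=bool)  # one-bit memory
    for plus, minus in DIFFERENTIATORS:
        diff = _window(img, plus) - _window(img, minus)   # accumulation / subtraction
        edge |= diff > thr                                # comparison, then OR in time
    return edge

def edge_map_direct(image, thr):
    """Reference procedure: absolute values, maximum over directions, then threshold."""
    img = np.asarray(image, dtype=float)
    grads = [np.abs(_window(img, p) - _window(img, m)) for p, m in PAIRS]
    return np.max(grads, axis=0) > thr
```

Under these assumptions both functions return the same edge map for any image and threshold, which is the point of the two replacements: the circuit never has to compute an absolute value or a maximum.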


3. Design and implementation of the circuit

3.1. Circuit design principle

The procedure of Fig.4-b makes it possible to produce a cellular system (Fig.5) with a single sequential SIMD control. Functionally, each pixel consists of: 1/ an elementary operator which executes the three basic operations; 2/ a photodiode for the optical data acquisition; 3/ the interconnections which make each photodiode connectable at any time to only one of its four neighbouring elementary operators according to the address code; 4/ a control logic which programmes the function of the whole system (operative code and address code). This control can carry out real programmable algorithms for several spatio-temporal convolutions.

Fig.4-b Simplified procedure of the modified Roberts detection for the "parallel and sequential" implementation.

3.2. Detail of the elementary operator and evaluation of the performance

The diagram of the operator and its four interconnection switches (K1-4) is shown in Fig.6.

The left part is for the acquisition and subtraction. The input is the photocurrent produced alternately by one of the four neighbouring photodiodes. With a double integration (first additive then subtractive) via the sole current mirror, the difference between the incident intensities is transformed into Vc, which is the voltage on the capacitor "C" (Fig.7). The bias of each photodiode is stabilized by the transistor "Tbias" in order to get a relatively linear Δintensity-voltage transformation.
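As a rough numerical illustration of the double integration, the sketch below assumes an ideal unity-gain current mirror, constant photocurrents and equal integration times for the additive and subtractive phases; none of these idealizations, nor the example values, come from the measurements.

```python
def vc_after_double_integration(i_add, i_sub, t_int, cap):
    """Voltage on C after an additive then a subtractive integration of duration t_int each."""
    v = 0.0
    v += i_add * t_int / cap   # additive integration: the first photocurrent charges C
    v -= i_sub * t_int / cap   # subtractive integration: the second photocurrent discharges C
    return v

# Hypothetical values: 1.0 nA and 0.8 nA photocurrents, 50 us per phase, C = 1 pF.
vc = vc_after_double_integration(1.0e-9, 0.8e-9, 50e-6, 1e-12)
```

With these hypothetical numbers Vc is about 10 mV, and it scales linearly with the difference of the two photocurrents, which is the property the comparator relies on.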

In Fig.6 (right side) a "quasi-static" binary memory cell has three functions: 1/ comparing "Vc" with the threshold (its value depends on the transistor sizes here); 2/ writing and memorizing only logic "1", which happens when the result of the comparison is "true" (one point of edge is detected); 3/ serving as a shifter. The data on each pixel can be output by successive shifting through the entire neighbourhood.

The function of a matrix with four elementary operators and nine photodiodes (Fig.5) was simulated with an HSPICE simulator in the Cadence environment before the implementation.


3.3. Implementation of the prototype

A prototype circuit was fabricated in CMOS technology: ES2 1.5 µm (dual-layer metal and single poly). This prototype (Fig.8) contains an elementary operator (as in Fig.6, surface size: 50 × 92 µm²), four photodiodes (each of size 30 × 30 µm²) and a capacitor of 1 pF (size: 100 × 100 µm²), which will not exceed 0.35 pF in the final version owing to the total circuit size.


Fig.5 Pixel array

Fig.6 Diagram of an elementary operator

Fig.7 Waveform shows the subtraction of the photocurrents achieved by the double integration on "Vc". Traces: rO, clk, op, Vc, Vout. Double integration on Vc (under test): additive integration, clk = K3 = 1, charge C; subtractive integration, clk = K4 = 1, discharge C (photodiode D3 is the more strongly illuminated). op = 1: write 1 in case Vc > threshold.

4. Test results of the prototype, conclusion and perspectives

Having tested the prototype, we verified the global linearity of the analog operator (acquisition and spatial or temporal differentiation). The results (Fig.9) show that when the sample rate for the incident image is within the limit of 10 kHz, the non-linearity percentage was in the lower 10 per decade range.

During the tests we were confronted with two problems: the difficulty of cutting off the photocurrent with the interconnection switches (due to the photo-absorption of the light incident upon the substrate), and the strong sensitivity of the current mirror to parametric dispersion in the weak-current range. These two points were addressed in a later version of the circuit. In that version, in order to lower its potential enough to turn the switch off, a control signal is first applied to a transistor follower whose source is connected to a tiny additional photodiode. To balance the current mirror we inserted a capacitor between the two gates of the mirror and adjusted the charges by the tunnel effect [4] to counter-balance the potentials on the two gates. Consequently we obtained a "reflected" current which closely followed the "original" one, the photocurrent. Therefore the two coefficients (one for the additive integration and the other for the subtractive one) coincide.

The  work  we  presented  here  gives  an  example  of  the  computing  implementation  that 
adapts  to  an  analog  cellular  circuit.  Its  key  points  :  1/  preserve  the  density  of  the  cellular 
computing;  2/  exploit  the  variety,  density  and  nature  of  the  spatial  distribution  —  the 
utilizable  mechanism;  3/  verify  the  resolution  and  global  linearity  of  the  completed 
operators;  4/  take  into  account  the  spatial  dispersion  for  eventual  compensation;  5/  preserve 
the  minimal  resources  of  configuration  (switches,  operation  codes,  parameters,  etc)  for  the 
minimal  adaptability  of  the  control  sequences. 

Fig.8 Photograph of the prototype


Fig.9  Characteristic  of  the  Vc/AL  transformation 


Inst. Phys. Conf. Ser. No 139: Part III

Paper presented at Opt. Comput. Int. Conf., Edinburgh, 22-25 August 1994
© 1995 IOP Publishing Ltd


Optoelectronic implementation of a phase-retrieval Vander Lugt correlator.


Santiago Vallmitjana, Arturo Carnicer, Estela Martín-Badosa, Ignacio Juvells.

Universitat de Barcelona, Laboratori d'Òptica, Departament de Física Aplicada i Electrònica. Diagonal 647, E08028 Barcelona, Spain.


Abstract.  An  implementation  of  a  Vander-Lugt  correlator,  which  operates 
with  a  single  spatial  light  modulator  is  proposed.  Optical  phase-retrieval 
manipulation,  based  on  the  symmetrization  of  the  input  scene  is  required. 
Theoretical  analysis,  simulations  and  some  experimental  results  are  presented. 

1.  Introduction. 

In recent years, some authors have proposed the same compact correlation architecture [1-3] for different purposes. These setups are based on the use of a spatial light modulator (SLM) to introduce and display information, a Fourier lens system, a video camera (CCD) to register light distributions and a computer that controls the whole system (Figure 1).

If the setup performs as a joint transform correlator, both the scene and the reference are displayed on the SLM and the CCD in the Fourier plane registers the joint interference between them. This power spectrum is processed by a computer and displayed again on the same SLM. A second optical Fourier transform is obtained and the correlation is recorded by the CCD. Rosen et al. have proposed the use of the same architecture to implement Vander-Lugt correlators [4], in which the scene is shown on the modulator and its power spectrum is recorded by the CCD. To recover the phase, they register the joint interference between the object and a plane wave, and in a second step, the reference is handled in the same way. Finally, the correlation between the scene and the target is obtained.

In this article we present an alternative to the original proposal that simplifies the experimental procedure and improves the method's capabilities.

2. Theoretical analysis.

The idea of retrieving the phase lost in the recording process is based on the symmetrization of the images used, as in the following expression: f(x-a, y) + f(-x-a, -y), where 2a is the separation between the original image and its symmetric counterpart. Figure 2a shows the symmetrized scene (three satellites). The images should be placed as close together as possible, without superposition, in order to take advantage of the spatial bandwidth of the modulator. When there is interference between the light distribution and a plane wave in the focal plane, the following intensity is obtained:

$$\left|\, |F(u,v)|\, e^{i\phi(u,v)}\, e^{-i 2\pi a u/\lambda f} + |F(u,v)|\, e^{-i\phi(u,v)}\, e^{i 2\pi a u/\lambda f} + C \,\right|^2 = \left|\, 2|F(u,v)| \cos[\phi(u,v) - 2\pi a u/\lambda f] + C \,\right|^2 \qquad (1)$$

where f is the focal length of the Fourier lens system and λ is the wavelength of the light.

By using a CCD with a square-root input response, and if the amplitude C of the plane wave is high enough, equation (1) becomes

$$2|F(u,v)| \cos[\phi(u,v) - 2\pi a u/\lambda f] + C \qquad (2)$$

The constant value C can easily be subtracted by computer. The reference g(x,y) can be processed in a similar way: g(x+a, y) + g(-x+a, -y). Figure 2b shows the geometry used in this case. In the Fourier plane, the light distribution can be written as

$$\left|\, |G(u,v)|\, e^{i\gamma(u,v)}\, e^{i 2\pi a u/\lambda f} + |G(u,v)|\, e^{-i\gamma(u,v)}\, e^{-i 2\pi a u/\lambda f} + C \,\right|^2 = \left|\, 2|G(u,v)| \cos[\gamma(u,v) + 2\pi a u/\lambda f] + C \,\right|^2 \qquad (3)$$

and finally, considering the response of the CCD, equation (3) becomes

$$2|G(u,v)| \cos[\gamma(u,v) + 2\pi a u/\lambda f] + C \qquad (4)$$

The product between |F(u,v)| cos[φ(u,v) - 2πau/λf] and |G(u,v)| cos[γ(u,v) + 2πau/λf] is performed in the computer:

$$|F(u,v)|\,|G(u,v)| \left[ e^{i\phi(u,v)} e^{-i 2\pi a u/\lambda f} + e^{-i\phi(u,v)} e^{i 2\pi a u/\lambda f} \right] \left[ e^{i\gamma(u,v)} e^{i 2\pi a u/\lambda f} + e^{-i\gamma(u,v)} e^{-i 2\pi a u/\lambda f} \right] = \qquad (5)$$

$$|F(u,v)|\,|G(u,v)| \left[ 2\cos(\phi(u,v) + \gamma(u,v)) + e^{\,i\phi(u,v) - i\gamma(u,v) - 4\pi i a u/\lambda f} + e^{-(i\phi(u,v) - i\gamma(u,v) - 4\pi i a u/\lambda f)} \right] \qquad (6)$$

This result is displayed on the SLM and in the focal plane we obtain the cross-correlation between f(x,y) and g(x,y) at a distance ±2a from the origin. In conclusion, the function |G(u,v)| cos[γ(u,v) + 2πau/λf] can be considered as a classical matched filter (CMF). If the amplitude of this filter is modified properly, several spatial correlation filters can be designed.
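The chain from symmetrization to correlation can be reproduced numerically with a small NumPy sketch. It is a discrete, periodic-grid idealization under stated assumptions: the grid size, the choice of C as 1.1 times the largest spectral amplitude, and the function names are illustrative, and the separation a is taken large enough that the correlation terms do not overlap the central convolution terms.

```python
import numpy as np

def symmetrize(img, shift, n):
    """Return img(x - shift, y) + img(-x - shift, -y) on an n x n periodic grid."""
    base = np.zeros((n, n))
    h, w = img.shape
    base[:h, :w] = img                                     # img(x, y), origin at index (0, 0)
    shifted = np.roll(base, shift, axis=1)                 # img(x - shift, y)
    mirrored = np.roll(base[::-1, ::-1], 1, axis=(0, 1))   # img(-x, -y) on the periodic grid
    mirrored = np.roll(mirrored, -shift, axis=1)           # img(-x - shift, -y)
    return shifted + mirrored

def detect(spectrum, c):
    """Square-root-response detector: sqrt of the intensity |S + C|^2, then subtract C."""
    intensity = np.abs(spectrum + c) ** 2
    return np.sqrt(intensity) - c

def correlate(scene, ref, a, n):
    """Correlation plane obtained from the two recorded distributions, as in equations (1)-(6)."""
    s_f = np.fft.fft2(symmetrize(scene, a, n))    # real: 2|F| cos(phi - 2 pi a u / (lambda f))
    s_g = np.fft.fft2(symmetrize(ref, -a, n))     # real: 2|G| cos(gamma + 2 pi a u / (lambda f))
    c = 1.1 * max(np.abs(s_f).max(), np.abs(s_g).max())   # plane wave strong enough (assumed factor)
    product = detect(s_f, c) * detect(s_g, c)     # product performed in the computer
    return np.fft.ifft2(product).real             # second Fourier transform: correlation plane
```

For an 8 x 8 pattern correlated with itself using a = 16 on a 128 x 128 grid, the output contains the autocorrelation peak at column offsets +2a and -2a (indices 32 and 96 of row 0), with the convolution terms confined to the neighbourhood of the origin.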

3. Simulated results

Several simulations have been carried out in order to illustrate the procedure described. Figure 3a presents the cross-correlation between the scene of Figure 2a and the reference of Figure 2b. As the filter has not been processed, it corresponds to the CMF.

A second set of simulations has been carried out using a simplification obtained by binarizing the distribution of equation (6). The binarization is made taking zero as the threshold. Figure 3b shows the cross-correlation between the scene and the satellite in Figure 2b obtained with this method.


Figure 3a.- Simulated cross-correlation (CMF filter).

Figure 3b.- Simulated cross-correlation (binarization method).


4.  Experimental  results. 

A single-SLM correlator which operates with a low-cost liquid crystal television (LCTV) has been implemented [5]. The LCTV used was removed from an Epson 100PS videoprojector [6]. The symmetrized scene is displayed on the LCTV. A CCD video camera with a square-root input response was connected to an 8-bit digitizer board. The LCTV is illuminated and the light distribution in the Fourier plane of the lens system interferes with a plane wave. This intensity is registered by the CCD and stored in the computer memory. In a second stage, the power spectrum of the symmetrized reference is stored in the same way. Then, the product between these distributions is computed and the result is displayed on the LCTV. Finally, the correlation is obtained in the focal plane of the Fourier lenses.

The experiments have been carried out with this correlator using the matched filter and binarizing the distribution on the LCTV. Figure 4a shows the cross-correlation zone using the CMF and Figure 4b the detection obtained with the binary method.


Figure  4a.-  Optical  correlation  (CMF  filter). 


Figure  4b.-  Optical  correlation  (binarization 
method). 


5.  Conclusions 

The use of a single-SLM architecture allows the simultaneous implementation of Vander-Lugt and joint transform correlators with similar recognition capabilities. The possibility of increasing the discrimination using the non-linear properties of binarization in the Fourier plane is shown.

Abstract. This paper describes a joint transform correlator, whose input is provided by two
acousto-optic  cells.  The  performance  of  the  correlator  is  analysed  and  modelled  and  results 
from  a  practical  system  are  presented.  It  is  shown  that  the  system  is  capable  of  detecting  and 
direction  finding  spread  spectrum  signals  which  are  more  than  10  dB  below  the  noise  level. 


1.  Introduction 

When  attempting  to  detect  a  spread  spectrum  signal  of  unknown  waveform  and  bandwidth, 
which  may  be  10  dB  or  more  below  the  noise  level,  a  single  receiver  is  generally  unable  to 
form  the  matched  filter  for  the  transmitted  waveform.  Since  a  matched  filter  is  essentially  a 
correlator,  an  alternative  approach  is  to  cross-correlate  the  signals  from  two  spatially  separated 
receivers  assuming  that  the  noise  in  the  two  receivers  is  uncorrelated.  The  computation  rates 
required  to  perform  the  correlations  digitally  are  high  for  signals  which  may  have  bandwidths 
of  the  order  of  100  MHz.  Acousto-optical  correlators  offer  a  possible  solution  since  they  are 
able  to  perform  analogue  cross-correlation  in  real  time.  Three  basic  architectures  are  available. 
Space-integrating  correlators  using  two  AO  cells  require  the  input  to  one  of  the  cells  to  be  time 
reversed  [1];  this  can  be  done  [2]  but  the  process  is  inefficient.  Time-integrating  correlators 
[1]  do  not  require  time  reversal  of  signals  but  they  suffer  from  unwanted  bias  terms  which  tend 
to  saturate  the  detector  array  and  limit  integration  time.  The  proposed  joint  transform 
correlator  (JTC),  which  combines  space-integrating  and  time-integrating  attributes,  suffers 
from  none  of  these  drawbacks.  No  time  reversal  is  required  and  unwanted  terms  are  easily 
separated  from  the  required  correlation. 

2.  Correlator  architecture. 

A  schematic  diagram  of  the  system  is  shown  in  Figure  1.  It  is  a  standard  joint  transform 
correlator  [3]  in  which  the  inputs  are  provided  by  the  first  order  diffracted  light  from  two 
identical  AO  cells.  The  lenses  and  spatial  filters  required  to  extract  the  first  order  light  from  the 
cells have been omitted for clarity.

Figure 1. The acousto-optic joint transform correlator - schematic

The input function for a joint transform correlator, with two objects separated by a distance 2a, is of the form

$$o(x) = f(x-a) + g(x+a) \qquad (1)$$

It is straightforward to show that if the Fourier transforms of f(x) and g(x) are written as |F(u)|exp{jφ(u)} and |G(u)|exp{jϕ(u)} respectively then the response of the square law detector of the spatial light modulator (SLM) will be

$$|O(u)|^2 = |F(u)|^2 + |G(u)|^2 + 2\,|F(u)|\,|G(u)| \cos\{2ua + \varphi(u) - \phi(u)\} \qquad (2)$$

The  final  term  in  equation  (2)  represents  a  set  of  sinusoidal  fringes  of  spatial  frequency  2a, 
amplitude  modulated  by  the  amplitude  of  the  cross  spectral  density  (CSD)  of  f(x)  and  g(x)  and 
phase  modulated  by  the  phase  of  the  CSD.  In  the  acousto-optic  JTC  the  input  functions  are  of 
the  form  f(x-a-Vt)  and  g(x+a-Vt),  where  V  is  the  acoustic  velocity  in  the  cells.  A  rigorous 
analysis  of  the  acousto-optic  system  will  not  be  presented  due  to  space  limitations.  However  it 
is  clear  that,  since  the  acoustic  velocity  in  the  two  cells  is  the  same,  the  two  functions  will 
maintain  their  separation  as  they  propagate  across  the  cells.  The  spatial  frequency  of  the  fringe 
pattern  will  therefore  remain  constant.  Consider  first  the  case  where  the  cells  are  illuminated 
only  for  the  period  that  the  two  functions  are  fully  within  the  cells.  The  light  intensity  at  the 
detector  will  be  as  given  in  equation  (2)  because  the  extra  phase  term  introduced  by  the  term 
Vt  will  cancel  out.  This  places  quite  a  severe  restriction  on  the  length  of  signal  that  can  be 
processed; the maximum length of an AO cell window is around 50 µs. Consider now the case
where  the  signals  are  longer  than  the  cell  window  and  the  cells  are  constantly  illuminated.  The 
separation  of  the  signals  will  still  be  maintained  at  2a  plus  a  term  due  to  any  (constant)  time 
shift  between  the  signals  so  the  spatial  frequency  of  the  fringes  will  remain  constant.  The  light 
intensity  at  the  detector  at  any  instant  will  be  of  the  form  of  equation  (2)  where  F(u)  and  G(u) 
now  represent  the  transforms  of  the  signals  within  the  cells  at  that  instant.  As  the  signals 
propagate  through  the  cells  a  composite  intensity  pattern  will  be  built  up  on  the  SLM  by  time 
integrating  on  the  detector. 
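A one-dimensional, static Python sketch illustrates how the fringe frequency encodes the separation of the two signals, including any constant time shift between them. It ignores the propagation term Vt and the time integration, works on a periodic sampled grid, and the function name and sample counts are assumptions for illustration only.

```python
import numpy as np

def jtc_output(f, g, a, n):
    """Output plane of a 1-D joint transform correlator for f(x - a) + g(x + a)."""
    o = np.zeros(n)
    o[a:a + len(f)] += f                          # f(x - a)
    idx = (np.arange(len(g)) - a) % n             # g(x + a), wrapped on the periodic grid
    o[idx] += g
    joint_power = np.abs(np.fft.fft(o)) ** 2      # square-law detection, as in equation (2)
    return np.abs(np.fft.fft(joint_power)) ** 2   # square modulus of its Fourier transform

# Hypothetical test signal: a short sampled chirp.
t = np.arange(32)
chirp = np.cos(2 * np.pi * (0.05 * t + 0.005 * t ** 2))

out_same = jtc_output(chirp, chirp, a=64, n=512)
# Delay the second input by 5 samples to mimic a constant time shift between receivers.
out_shift = jtc_output(chirp, np.concatenate([np.zeros(5), chirp]), a=64, n=512)

peak_same = 40 + int(np.argmax(out_same[40:256]))     # skip the zero-lag terms near the origin
peak_shift = 40 + int(np.argmax(out_shift[40:256]))
```

In this sketch the correlation peak sits at a lag of 2a = 128 samples for identical inputs and moves to 123 samples when one input is delayed by 5 samples, which is the displacement a direction-finding system would read out.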

3.  Computer  simulations 

Simulations have been carried out using Mathcad to investigate the performance of the system. Figure 2(a) shows a snapshot of the joint transform obtained at a particular instant as a chirp of length 100 µs and bandwidth 1 MHz, centred on 15 MHz, passes through a pair of Bragg cells whose time aperture is 50 µs. Figure 2(b) shows the result of integrating the joint transform over the 150 µs interval between the chirp first entering the cells and finally leaving them. The spatial frequency of the fringes remains constant and would require a spatial light modulator with an aperture of 50 mm and a resolution of 30 line-pairs/mm. Such devices are available.

Figure 2. Simulations: (a) a snapshot joint transform and (b) an integrated joint transform.

Figure 3 shows the results of simulations carried out using a chirp of length 100 µs and bandwidth 10 MHz, centred on 15 MHz, buried in noise of bandwidth 30 MHz with a signal to noise ratio (SNR) of -10 dB. In Figure 3(a) the relative time shift (τ) between the chirps in the two cells was zero; in Figure 3(b) τ = 500 ns. The plots show the square modulus of the Fourier transform of the joint transform, which is the output that would be expected from the detector array. The output values have been normalised to a peak value of unity. The time shift is clearly visible and the output SNR, calculated as the peak to mean ratio of the output function, is around 14 dB. This is a reasonable result; the time-bandwidth product of the chirp is 1000 giving a signal processing gain of 33 dB. The missing 9 dB can be attributed to signal-noise and particularly noise-noise correlation that would not occur in a matched filter [4].
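The gain budget quoted above can be checked with a few lines of Python. The 33 dB figure matches 10 log10(2TB) rather than 10 log10(TB) = 30 dB; the factor of 2 is inferred here from the quoted numbers and is an assumption, not something the text states.

```python
import math

def processing_gain_db(duration_s, bandwidth_hz):
    """Gain in dB computed as 10*log10(2*T*B); the factor 2 is an inferred convention."""
    return 10 * math.log10(2 * duration_s * bandwidth_hz)

sim_gain = processing_gain_db(100e-6, 10e6)   # simulated chirp: T = 100 us, B = 10 MHz
achieved = 14 - (-10)                         # output SNR minus input SNR, in dB
shortfall = sim_gain - achieved               # the "missing" gain relative to a matched filter
```

Under this convention the simulated chirp gives a gain of about 33 dB and a shortfall of about 9 dB, in line with the figures in the text.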

4.  Experimental  results 


The system has been tested in principle using a single Bragg cell through which two identical chirps were passed. The cells used had a time aperture of 50 µs and a bandwidth of 30 MHz centred on 45 MHz. The chirps used were of length 5 µs and bandwidth 10 MHz centred on 45 MHz. Figure 4 shows the joint transform obtained when the spacing between the two chirps within the cell was 5 µs. Figure 5 shows the correlation output obtained when the above chirps were buried in noise with noise bandwidth 30 MHz and a SNR of around -5 dB. This result was obtained by digitising the joint transform recorded by a 2048 element detector array and plotting the square modulus of its Fourier transform. The apparent asymmetry in the correlation function is due to the reduced response of the array at higher spatial frequencies. The output SNR is around 8 dB, which again is a reasonable figure. The time bandwidth product of the chirp was 50 (signal processing gain of 20 dB) and the input SNR was -5 dB.

Figure 3. Simulated array output with input SNR = -10 dB for (a) τ = 0, (b) τ = 500 ns

Figure 4. Experimental joint transform

Figure 5. Experimental correlation output

5.  Conclusions 

It  has  been  shown  that,  in  principle,  the  acousto-optic  joint  transform  correlator  described  is 
capable  of  detecting  spread  spectrum  signals  buried  in  noise  by  correlating  the  signals  received 
at  two  spatially  separated  receivers.  The  signal  processing  gain  of  the  system  is  worse  than  a 
matched filter by up to 9 dB. This is to be expected since a matched filter is, by definition, the
system  which  gives  the  greatest  signal  to  noise  enhancement.  Analysis  of  this  system  shows 
that  its  signal  processing  gain  will  be  at  least  3  dB  worse  than  a  matched  filter  [4],  the 
degradation  being  due  to  extra  signal-noise  and  noise-noise  correlation  terms.  The  system  also 
has  potential  for  direction  finding  since  the  time  of  arrival  at  the  two  receivers,  and  hence  the 
separation  of  the  signals  in  the  Bragg  cells,  depends  on  the  angle  of  arrival  of  the  signal. 
Bearing  resolution  will  depend  upon  the  separation  of  the  receivers  and  the  width  of  the 
compressed  correlation  pulse. 
Abstract. Both a theoretical analysis and practical results are presented for a non-heterodyning
acousto-optic  space  integrating  correlator  that  uses  the  zeroth  diffraction  order  to  produce  a  true 
correlation  function  containing  both  amplitude  and  phase  information. 


1.  Introduction 


The structure of the zeroth order non-heterodyning space integrating correlator is shown in
figure 1.

Figure 1. Physical structure of zeroth order space integrating acousto-optic correlator.

The  basic  architecture  is  similar  to  many  correlators  previously  described  in  the  literature 
[e.g.  1,2,3],  but  there  are  two  important  differences.  The  spatial  filters  select  all  or  part  of  the 
zeroth  diffraction  order.  Most  correlators  use  the  first  order,  sometimes  treating  the  zeroth 
order  as  if  it  contained  no  information.  In  reality,  the  zeroth  order  contains  as  much 
information  as  all  the  other  orders  put  together,  but  it  can  be  difficult  to  access  due  to  large 
optical  power  incident  on  a  photodetector  causing  saturation  or  excessive  shot  noise. 
However,  it  can  be  shown  that  the  zeroth  diffraction  order  has  sub-orders  that  can  be  used 
very  effectively.  The  second  important  difference  lies  in  the  way  the  signals  are  introduced 
into  the  acousto-optic  cells.  A  common  technique  is  to  insert  the  signals  as  double  sideband 
suppressed carrier modulation of a carrier at the centre frequency of the acousto-optic cell.
The  modulation  used  here  is  double  sideband  large  carrier.  The  difference  is  crucial. 


2.  Information  in  the  zeroth  order. 

Using the co-ordinate frame such that the x axis lies in the direction of propagation of the
acoustic wave with its origin at the transducer and the z axis lies in the direction of
propagation of the incident optical plane wave, and assuming Raman-Nath diffraction, the
optical wave at the output of the acousto-optic cell can be described by [3]

$$u(x,t) = \mathrm{Re}\left\{ \exp[j(\omega_0 t - kz)] \, \exp[-j\beta\, v(t - x/V)] \right\} \mathrm{rect}(x/W) \qquad (1)$$

where $\omega_0$ and $k$ are the angular frequency and wave vector of the input optical signal, $v(t)$ is
the electrical input signal to the cell, $V$ is the acoustic wave velocity, $\mathrm{rect}(x/W)$ is the
windowing function due to the acousto-optic cell of aperture width $W$, and $\beta$ is a modulation
index which is a function of input signal power and cell parameters. $v(t)$ will usually comprise
a  carrier,  at  the  centre  frequency  of  the  acousto-optic  cell,  modulated  by  a  signal  of  interest. 
We  will  only  consider  amplitude  modulation  here,  but  two  types  of  amplitude  modulation  are 
possible  and  the  difference  is  crucial.  The  zeroth  order  correlator  makes  use  of  information  in 
the  zeroth  diffraction  order.  This  information  is  not  available  if  we  use  double  sideband 
suppressed  carrier  modulation  but  it  is  available  if  we  use  double  sideband  large  carrier 
modulation.  The  electronic  input  signal  to  the  acousto-optic  cell  then  takes  the  form 

$$v(t) = [1 + m\,a(t)] \sin(\omega_c t) \qquad (2)$$

where $a(t)$ is the input signal of interest, $m$ the amplitude modulation index, and $\omega_c$ the angular
frequency of the carrier. The optical wave at the output of the acousto-optic cell becomes

$$u(x,t) = \mathrm{Re}\left\{ \exp[j(\omega_0 t - kz)] \, \exp[-j\beta\,(1 + m\,a(t - x/V)) \sin(\omega_c t - K_c x)] \right\} \mathrm{rect}(x/W) \qquad (3)$$

where $K_c$ is the acoustic wave vector at the carrier frequency. Expanding out the second
exponential term in this expression and including second order terms enables us to obtain an
expression for the plane wave component, ${}^{0}u(x,t)$, that gives the zeroth diffraction order. Most
papers on acousto-optic correlation [e.g. 1,3] include only first order terms in this expansion
since this is sufficient to describe the first diffraction order usually used. If we perform this
expansion and gather terms up to second order then we obtain

$${}^{0}u(x,t) = \left\{ 1 - \frac{\beta^2}{4}\left[1 + m\,a(t - x/V)\right]^2 \right\} \cos(\omega_0 t - kz)\, \mathrm{rect}(x/W) \qquad (4)$$

If we consider a sinusoidal component of $a(t)$ of the form $\cos(\omega_a t)$, then the component of
${}^{0}u(x,t)$ due to such a component is

$$\left\{ 1 - \frac{\beta^2}{4}\left[1 + m \cos(\omega_a t - K_a x)\right]^2 \right\} \cos(\omega_0 t - kz)\, \mathrm{rect}(x/W) \qquad (5)$$
Further expansion of this expression reveals [4] that the zeroth diffraction order can be broken
down into diffraction sub-orders 0,n, where n = 0, ±1, ±2. The 0,±2 sub-orders are very small
but the 0,±1 sub-orders are of significant amplitude and, when focused into the Fourier
transform plane of a lens, comprise the Fourier transform of $a(t)$, convolved with the Fourier
transform of the window function $\mathrm{rect}(x/W)$. Figure 2 shows a prediction of the form of the
0,0 and 0,±1 diffraction sub-orders, generated using Mathcad, for an input signal $a(t)$ which is
a linear chirp pulse sweeping from 8 to 12 MHz in 20 μs, with $m = 0.3$ and $\beta = 0.5$, and an
acousto-optic cell with a time window of 50 μs.
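
The relative size of the sub-orders can be checked with a short numerical sketch. The Python fragment below is not the Mathcad calculation used for the figure; it assumes that the zeroth-order envelope has the second-order form 1 - (β²/4)[1 + m a(t)]², which is the small-argument expansion of the carrier-phase average of exp(-jx sin θ), and it uses a hypothetical sample rate and hypothetical function names. It compares the energy found near 0 MHz, in the 8 to 12 MHz band and in the 16 to 24 MHz band for the chirp quoted above.

```python
import numpy as np

m, beta = 0.3, 0.5            # modulation indices quoted in the text
fs = 100e6                    # assumed sample rate
t = np.arange(5000) / fs      # 50 us cell time window

# a(t): linear chirp from 8 to 12 MHz lasting 20 us, zero for the rest of the window.
chirp_on = t < 20e-6
sweep_rate = (12e6 - 8e6) / 20e-6
a = np.where(chirp_on, np.cos(2 * np.pi * (8e6 * t + 0.5 * sweep_rate * t**2)), 0.0)

def zeroth_order_exact(x, n_phase=4096):
    """Carrier-phase average of exp(-j x sin(theta)): the zeroth-order amplitude."""
    theta = 2 * np.pi * np.arange(n_phase) / n_phase
    return np.mean(np.exp(-1j * x * np.sin(theta))).real

def zeroth_order_second_order(x):
    """The same amplitude with the expansion kept to second order: 1 - x**2 / 4."""
    return 1 - x**2 / 4

envelope = zeroth_order_second_order(beta * (1 + m * a))
spectrum = np.abs(np.fft.fft(envelope)) ** 2
freqs = np.fft.fftfreq(t.size, 1 / fs)

def band_energy(f_low, f_high):
    """Spectral energy at |f| between f_low and f_high (both signs of frequency)."""
    selected = (np.abs(freqs) >= f_low) & (np.abs(freqs) <= f_high)
    return spectrum[selected].sum()

e_00 = band_energy(0, 2e6)       # 0,0 sub-order
e_01 = band_energy(8e6, 12e6)    # 0,+1 and 0,-1 sub-orders
e_02 = band_energy(16e6, 24e6)   # 0,+2 and 0,-2 sub-orders
print("0,±1 relative to 0,±2 (dB):", 10 * np.log10(e_01 / e_02))
```

Under these assumptions the 0,±1 band carries far more energy than the 0,±2 band, in line with the statement that the 0,±2 sub-orders are very small.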


Figure 2. Predicted intensity distribution of the zeroth diffraction order, showing 0,0 and 0,±1 sub-orders (horizontal axis: frequency in MHz, from -15 to 15).


3.  Analysis  of  the  zeroth  order  correlator 

In the zeroth order space integrating correlator, as shown in figure 1, the signal input, $v_1(t)$, to
the first cell is time reversed so that the signal actually delivered to the transducer is $v_1(T-t)$,
where $T$ is the time window of the cell. The travelling wave in the acousto-optic cell can then
be described as proportional to $v_1(x-Vt)$. If the second signal is not time reversed but the sense
of the x axis is reversed while maintaining its origin at the transducer of the second cell, then
the travelling wave in the second cell can be described as proportional to $v_2(x+Vt)$. The zeroth
order plane wave component of the output from the first cell is given by equation 4 which can
be rewritten as

$${}^{0}u_1(x,t) = \left\{ 1 - \frac{\beta^2}{4}\left[1 + m\,a_1(x - Vt)\right]^2 \right\} \cos(kz - \omega_0 t)\, \mathrm{rect}(x/W) \qquad (6)$$

The  first  spatial  filter  passes  the  whole  of  the  zeroth  diffraction  order  and  so,  if  the  lenses  are 
large  enough  to  treat  their  apertures  as  infinite,  this  component  is  imaged  onto  the  second  cell 
where  reversal  of  the  sense  of  the  x  axis  enables  us  to  use  equation  6  unchanged  to  describe 
the  optical  input  to  the  second  cell.  The  zeroth  order  plane  wave  component  at  the  output  of 
the  second  cell  is  then 

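$${}^{0}u_2(x,t) = \left\{ 1 - \frac{\beta^2}{4}\left[1 + m\,a_1(x - Vt)\right]^2 \right\} \left\{ 1 - \frac{\beta^2}{4}\left[1 + m\,a_2(x + Vt)\right]^2 \right\} \cos(kz - \omega_0 t)\, \mathrm{rect}(x/W) \qquad (7)$$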

Expanding out this expression produces a large number of terms [5]. The full expansion is not
given here due to the limitations of space. The terms of interest are the first order terms taking
the form

$$\{ a_1(x - Vt) + a_2(x + Vt) \} \cos(kz - \omega_0 t)\, \mathrm{rect}(x/W) \qquad (8)$$

Provided that $a_1(t)$ and $a_2(t)$ occupy bandwidths of less than one octave and that their lowest
frequency components are sufficiently high, the final spatial filter can select out one of the 0,±1
diffraction sub-orders containing these first order terms. If a photodiode placed behind this
spatial filter is of sufficient area to receive the whole of this diffraction sub-order then the
spatially integrated intensity of the light incident on the photodiode will contain slowly varying
functions plus the desired cross correlation function in the form

$$\int a_1(x - Vt)\, a_2(x + Vt)\, \mathrm{rect}(x/W)\, dx \qquad (9)$$


Because of the windowing term the correlation function can only be formed if the functions
$a_1(t)$ and $a_2(t)$ are shorter than the time window of the cells so that the overlap between them
will lie within this time window at all times. Subject to the restrictions on the signal
bandwidths and temporal lengths, this relatively simple correlator produces the complete
correlation function, compressed in time by a factor of two, giving both amplitude and phase
information. Figure 3 shows the autocorrelation function of a linear chirp pulse sweeping from
4.25 to 5.75 MHz in 15 μs obtained using the correlator illustrated in figure 1.
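
The factor-of-two time compression follows from the counter-propagation of the two signals: substituting u = x - Vt in the integral gives the correlation of the two signals evaluated at a lag of 2Vt. The Python sketch below illustrates this on a sampled version of the chirp quoted above. The sample rate, array sizes, window width and function name are assumptions made for the illustration, and shifts are restricted to whole samples.

```python
import numpy as np

fs = 40e6                                    # assumed sample rate
n_chirp = int(round(15e-6 * fs))             # 15 us chirp -> 600 samples
tc = np.arange(n_chirp) / fs
sweep_rate = (5.75e6 - 4.25e6) / 15e-6
chirp = np.cos(2 * np.pi * (4.25e6 * tc + 0.5 * sweep_rate * tc**2))

n_total, n_window = 4096, 1024               # padded axis and cell aperture (samples)
start = (n_total - n_chirp) // 2
a1 = np.zeros(n_total)
a1[start:start + n_chirp] = chirp            # both signals centred in the cell at Vt = 0
a2 = a1.copy()
x = np.arange(n_total)
window = np.abs(x - n_total // 2) < n_window // 2    # rect(x/W)

def space_integrated_output(shift):
    """Sum over x of a1(x - Vt) a2(x + Vt) rect(x/W), with Vt equal to `shift` samples."""
    return np.sum(np.roll(a1, shift) * np.roll(a2, -shift) * window)

shifts = np.arange(-(n_chirp // 2) + 1, n_chirp // 2)
output = np.array([space_integrated_output(s) for s in shifts])

# Ordinary autocorrelation; lag l sits at index l + n_chirp - 1 in "full" mode.
full = np.correlate(chirp, chirp, mode="full")
reference = full[2 * shifts + n_chirp - 1]   # correlation sampled at lag 2 * shift

print("largest deviation from R(2Vt):", np.max(np.abs(output - reference)))
```

The window term has no effect here because the chirp is shorter than the assumed aperture, which is the condition stated in the text.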


Figure 3. Autocorrelation function of a linear chirp pulse produced by the correlator shown in figure 1.


References 

[1] Lee J N and VanderLugt A 1989 Proc. IEEE 77 1528-1557

[2] Reeve C D and Wombwell J F 1989 Proc. IEE F 136 185-190

[3] Rhodes W T 1981 Proc. IEEE 69 65-79

[4]  Houghton  A  W  and  Reeve  C  D  1992  Royal  Naval  Engineering  College  Research  Report  920'il) 

[5]  Houghton  A  W  and  Reeve  C  D  1993  Royal  Naval  Engineering  College  Research  Report  93t)16 

Copyright  ©  HMSO,  London,  1994 


Inst.  Phys.  Conf.  Ser.  No  139:  Part  III 

Paper  presented  at  Opt.  Comput.  Int.  Conf.,  Edinburgh,  22-25  August  1994 
© 1995 IOP Publishing Ltd




Edge  enhancement  in  photorefractive  Joint  Transform 
Correlators 


Olivier Daniel, Jean-Michel C. Jonathan and Gerald Roosen
Institut  d'Optique  Theorique  et  Appliquee  -  Unite  Associee  au  CNRS  n°  14 
Centre  Scientifique  d'Orsay,  Batiment  503, 

BP.  147,  91403,  ORSAY  CEDEX  -FRANCE 


Abstract.  We  give  an  analytical  expression  for  the  space  charge  field  induced  at  steady  state 
when  a  photorefractive  crystal,  used  in  the  diffusion  regime,  is  illuminated  by  a  non  uniform 
interference  pattern.  Its  consequences  on  the  response  of  a  photorefractive  Joint  Transform 
Correlator,  including  the  effect  of  the  reading  beam  are  predicted  and  experimentally  observed. 
The  observed  edge  enhancement  is  explained. 


1.  Introduction 

Joint  Transform  optical  Correlators  (JTC)  are  commonly  implemented  using  liquid  crystal 
Spatial  Light  Modulators  for  the  input  of  the  images  to  be  compared.  CCD  cameras  allow  a 
quadratic  detection  in  the  Fourier  plane  and  a  linear  JTC  is  thus  obtained  [1].  The  system  is 
quite  different  when  a  photorefractive  crystal  is  used. 

The  purpose  of  this  paper  is  to  demonstrate  the  specific  non  linearity  of  the 
photorefractive  response  and  show  how  it  affects  the  output  of  the  correlator. 


2.  The  photorefractive  response 

Kukhtarev's usual solution of the band transport model for the photorefractive effect
fails  when  the  spatial  distribution  of  illumination  varies  rapidly  on  the  crystal  or  is  two 
dimensional.  It  also  fails  to  describe  the  case  where  the  modulation  depth  of  the  interference 
pattern  locally  becomes  close  to  unity.  This  is  what  happens  in  the  experimental  conditions 
of  a  photorefractive  Joint  Transform  Correlator.  To  predict  the  steady  state  photorefractive 
response  in  these  experimental  conditions,  we  use  a  method  [2],  which  takes  into  account 
the  highly  non  uniform,  two  dimensional,  non  periodic  illumination. 

With  a  single  additional  assumption,  it  provides  an  analytical  expression  for  the 
steady  state  space  charge  field  and  an  efficient  model  in  the  specific  experimental  conditions 
of  our  photorefractive  JTC  where  the  use  of  thin  crystals  and  the  low  diffraction  efficiency 
do  not  require  beam  propagation. 


2.1.  The  photorefractive  space  charge  field 

At  steady  state,  with  no  external  field  applied  to  the  crystal,  the  induced  space  charge 
field  is  ruled  by  3  equations.  Most  important,  the  charge  transport  equation  describes  the 
equilibrium,  which  is  reached  between  drift  and  diffusion  of  the  charge  carriers.  It  is 
completed by the rate equation and Poisson's equation [3]. With the approximations
$N_D^+(r) \ll N_D$ and $n(r) \ll N_A$ (low illumination) (where $N_D$ is the volume density of
photorefractive sites, $N_D^+$ that of the ionized ones and equals $N_A$ in the dark), one finds
as in [2]:

$$E(r) - \frac{1}{k_0^2}\, \frac{\nabla[\nabla \cdot E(r)]}{1 + \dfrac{e}{k_B T k_0^2}\, \nabla \cdot E(r)} = -\frac{k_B T}{e}\, \nabla \ln[I(r) + I_d] \qquad (1)$$

where $E(r)$ is the space charge field, $k_0^2 = e^2 N_A / (\varepsilon_{DC} k_B T)$, $I(r)$ is the illumination of the
crystal and $I_d$ is phenomenologically added to describe the effects of dark conductivity and
illumination by the reading beam.

Additionally, we assume that the variation of the induced space charge field on a
distance equal to the Debye screening length $2\pi/k_0$ is small compared to the maximum
1  k  T 

possible  value  of  the  space  charge  field  induced  by  diffusion.  This  approximation 

1  e 

is  valid  in  a  wide  range  of  experimental  conditions.  In  the  case  of  a  sinusoidal  space  charge 
field,  it  would  only  fail  when  the  modulation  depth  gets  close  to  unity  and  the  grating  wave 

vector  k close  to  k^ .  We  thus  assume  that  m«^  instead  of  m «  1  as  in  the  Kukhtarev's 

solution.  In  BSO  where  k^  - 1.5  10^ cm"*,  this  is  verified  even  at  high  modulation  for  the 
values  of  k  corresponding  to  our  JTC.  The  derivative  in  the  denominator  of  (1)  may  then  be 
neglected  and  one  gets: 

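$$E(r) - \frac{1}{k_0^2}\, \nabla[\nabla \cdot E(r)] = -\frac{k_B T}{e}\, \nabla \ln[I(r) + I_d] \qquad (2)$$


By Fourier transforming back and forth, the space charge field is found to be:

$$E(r) = -\frac{k_B T}{e} \left\{ \nabla \ln[I(r) + I_d] \right\} \otimes \left\{ \frac{k_0}{2} \exp(-k_0 |r|) \right\} \qquad (3)$$


where $\otimes$ denotes the convolution.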

The  space  charge  field  is  obtained  as  a  vector.  Expression  (3)  of  the  induced  space 
charge  field,  is  valid  for  any  distribution  of  illumination.  It  discloses  the  non  linearity  of  the 
photorefractive  response,  including  how  details  smaller  than  the  screening  length  are 
smoothed  down  by  the  convolution.  The  logarithm  compresses  the  dynamics,  which  is  very 
useful  in  the  Fourier  plane.  The  derivative  then  performs  edge  enhancement  in  the  plane  of 
the  photorefractive  crystal.  The  effect  of  this  non  linear  processing  in  the  Fourier  plane  of  a 
JTC  is  described  in  the  following  paragraph. 
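
The three operations named above (logarithm, derivative, smoothing by convolution) can be illustrated in one dimension. The Python sketch below is only a check under stated assumptions: it works in normalised units (k_0 = 1, k_B T/e = 1), takes the convolution kernel to be (k_0/2) exp(-k_0 |x|), uses a hypothetical soft-edged bright bar as illumination, and all variable names are illustrative. It builds the field by convolution and verifies numerically that it satisfies the one-dimensional form of the differential relation E - E''/k_0² = -(k_B T/e) d/dx ln(I + I_d).

```python
import numpy as np

k0 = 1.0            # Debye wave number (normalised units, assumption)
kT_over_e = 1.0     # k_B T / e (normalised units, assumption)
I_d = 0.05          # dark / reading-beam term relative to the peak illumination

x = np.linspace(-20.0, 20.0, 4001)
dx = x[1] - x[0]

# Hypothetical illumination: a bright bar between x = -5 and x = +5 with soft edges.
I = 0.5 * (np.tanh((x + 5) / 0.5) - np.tanh((x - 5) / 0.5))

# Source term: -(k_B T / e) d/dx ln(I + I_d).
source = -kT_over_e * np.gradient(np.log(I + I_d), dx)

# Space charge field: convolution of the source with (k0 / 2) exp(-k0 |x|).
kernel = 0.5 * k0 * np.exp(-k0 * np.abs(x))
E = np.convolve(source, kernel, mode="same") * dx

# Check the differential relation on interior points.
E_xx = (E[2:] - 2 * E[1:-1] + E[:-2]) / dx**2
residual = E[1:-1] - E_xx / k0**2 - source[1:-1]

x_peak = x[np.argmax(np.abs(E))]
print("largest residual:", np.max(np.abs(residual)))
print("|E| is largest near x =", x_peak, "and E at the bar centre is", E[x.size // 2])
```

In this example the field is concentrated at the edges of the bar and vanishes at its centre, which is the edge enhancement referred to in the text.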


3.  The  photorefractive  JTC 

3.1.  Predicted  behavior 


We now show the consequences of relation (3) in the photorefractive JTC. $O(r')$ and $S(r')$
are centered at $(-X_0, -Y_0)$ and $(X_0', Y_0')$ in the input plane. These coordinates and the

Fourier transformer define the carrier spatial frequency $k = \frac{2\pi}{\lambda f}\,(X_0 + X_0',\; Y_0 + Y_0')$. The


intensity distribution on the BSO crystal is: $I(r) = I_0(r)\,[1 + m(r)\cos(k \cdot r + \delta)]$

with $I_0(r) = I_d + |o(r)|^2 + |s(r)|^2$, $m(r) = 2\,|o(r)|\,|s(r)| / I_0(r)$ and $\delta = \arg[o(r)\,s^*(r)]$.


Substituting $I(r)$ into (3), one straightforwardly finds an expression involving ratios of
trigonometric functions whose Fourier expansions are tabulated [4]. Assuming that the
frequencies in $m(r)$ are small compared to the carrier frequency $k$, they may be used to
obtain a new expression for the space charge field. These Fourier expansions introduce the
reduced modulation $\rho(r) = \frac{1}{m(r)}\left(\sqrt{1 - m^2(r)} - 1\right)$:

$$E(r) = -\frac{k_B T}{e} \left\{ \nabla \ln\!\left[\frac{I_0(r)}{1 + \rho^2(r)}\right] + 2k \sum_{n \ge 1} \rho^n(r) \sin[n(k \cdot r + \delta)] - 2\,\frac{\nabla \rho(r)}{\rho(r)} \sum_{n \ge 1} \rho^n(r) \cos[n(k \cdot r + \delta)] \right\} \otimes \left\{ \frac{k_0}{2} \exp(-k_0 |r|) \right\} \qquad (4)$$

The  first  term  between  braces  gives  rise  to  the  center  peak  of  the  correlation  plane. 
The  other  ones  generate  the  usual  correlation  peak  and  its  symmetrical  counterpart.  Higher 
harmonics  are  caused  by  the  non  linearity  of  the  photorefractive  response.  The  effect  of  high 
modulation is described by $\rho(r)$ as previously found in [5]. The last term is the correction
brought to Kukhtarev's model. The first order of diffraction responsible for the
correlation is thus:

$$E(r) = -2\,\frac{k_B T}{e} \left\{ k\,\rho(r) \sin(k \cdot r) - \nabla \rho(r) \cos(k \cdot r) \right\} \otimes \left\{ \frac{k_0}{2} \exp(-k_0 |r|) \right\} \qquad (5)$$

and may be extracted. The assumption that frequencies in $m(r)$ are small compared to the
carrier frequency $k$ guarantees that overlapping of orders is avoided. The low diffraction
efficiency in the diffusion regime (≪ 1%) and the use of a thin crystal thus do not allow
multiple diffraction that would couple back higher order diffracted terms.


The role of $I_d$ appears here. When the two objects are identical,

$$m(r) = \frac{2\,|o(r)|^2}{I_d + 2\,|o(r)|^2}$$

If $I_d$ is neglected, $m(r)$ is identical to unity and $\rho(r)$ therefore has unit modulus. The cosine
in equation (5) vanishes and the sine one is of constant amplitude. The modulated space charge field
contains no information about the object. A delta function is observed in the correlation
plane at the position $(X_0', Y_0')$ of the object. It is precisely localized but the "correlation
peak" bears no information about its shape.

Introducing $I_d$ broadens that peak by bringing back frequencies into $\rho(r)$. It reveals
the second term, i.e. the space derivative of $\rho(r)$.
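
The role of $I_d$ can be made concrete with a few numbers. The Python sketch below rests on an assumption that should be stated plainly: it takes the reduced modulation to be ρ = [√(1 - m²) - 1]/m, the form for which ln(1 + m cos θ) = -ln(1 + ρ²) - 2 Σ ρⁿ cos(nθ)/n, and it checks that expansion numerically before using it. The sample values of |o(r)|² and $I_d$ and the function names are hypothetical.

```python
import numpy as np

def modulation_identical_objects(o_abs2, I_d):
    """m(r) = 2|o|^2 / (I_d + 2|o|^2) for identical object and scene."""
    o_abs2 = np.asarray(o_abs2, dtype=float)
    return 2 * o_abs2 / (I_d + 2 * o_abs2)

def reduced_modulation(m):
    """Assumed form rho = (sqrt(1 - m^2) - 1) / m, valid for 0 < m <= 1."""
    m = np.asarray(m, dtype=float)
    return (np.sqrt(1 - m**2) - 1) / m

def log_series(m, theta, n_terms=200):
    """Fourier series of ln(1 + m cos(theta)) written with the reduced modulation."""
    rho = reduced_modulation(m)
    n = np.arange(1, n_terms + 1)[:, None]
    return -np.log(1 + rho**2) - 2 * np.sum(rho**n * np.cos(n * theta) / n, axis=0)

# Check the expansion for a moderate modulation depth.
theta = np.linspace(0, 2 * np.pi, 721)
series_error = np.max(np.abs(log_series(0.8, theta) - np.log(1 + 0.8 * np.cos(theta))))
print("largest error of the series for m = 0.8:", series_error)

# Hypothetical samples of |o(r)|^2 across the Fourier plane.
o_abs2 = np.array([0.2, 1.0, 3.0])
rho_without_Id = reduced_modulation(modulation_identical_objects(o_abs2, 0.0))
rho_with_Id = reduced_modulation(modulation_identical_objects(o_abs2, 0.5))
print("rho with I_d = 0  :", rho_without_Id)
print("rho with I_d = 0.5:", rho_with_Id)
```

With $I_d$ = 0 the reduced modulation has the same unit modulus at every sample, so its gradient vanishes; with a non-zero $I_d$ it follows |o(r)|², which is what restores object information to the first diffraction order.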


3.2.  Experimental  set-up  and  results 

The  geometry  of  our  experimental  set-up  is  directly  derived  from  that  described  by 
Rajbenbach et al. [6]. The object O to be recognized and the scene S where it is expected are
introduced  in  the  input  plane  using  a  TFT  nematic  liquid  crystal  spatial  light  modulator 
illuminated  by  the  collimated  beam  of  a  diode  pumped  doubled  YAG  laser.  Their  Fourier 
transforms  o  and  s  are  formed  by  the  same  lens.  The  resulting  distribution  of  intensity  is 
recorded  as  a  modulation  of  refractive  index  by  a  BSO  crystal.  It  is  observed  using  the  670 
nm  beam  of  a  laser  diode  incident  from  the  back  at  an  angle  adjusted  to  fulfill  the  Bragg 
condition.  At  that  wavelength,  the  sensitivity  of  the  BSO  crystal  is  usually  considered  as 
negligible  compared  to  that  at  the  writing  wavelength  (532  nm),  so  that  reading  and  writing 
may  take  place  simultaneously.  The  Fourier  transform  of  the  amplitude  distribution 
diffracted  by  the  photorefractive  grating  is  observed  in  real  time  in  the  correlation  plane  Pc, 
using  a  CCD  camera. 

Figure  1  gives  experimental  evidence  for  that  prediction.  One  half  of  the  correlation 
plane  is  represented  in  the  case  of  the  autocorrelation  of  a  disk.  It  shows  (a)  a  simulation  of  a 
linear  response  correlator  and  of  a  photorefractive  JTC  from  which  the  central  order  of 
diffraction  is  removed  by  polarizers,  (b)  a  scan  through  an  experimental  "correlation  peak" 
of  our  photorefractive  JTC  . 

The  bright  central  peak  in  b)  would  be  the  only  correlation  signal  in  the  absence  of 
the reading beam. In the real conditions, where the sensitivity to the red is only 1/7 of the
sensitivity to the green and as described by the model, this peak widens and is surrounded
by a weaker ring.


a)  Prediction 


b)  Observation 


Figure  1 :  Autocorrelation  of  a  disk 


6.  Conclusion 

The  model,  valid  at  steady  state  in  the  diffusion  regime,  provides  original  results.  We  can 
now  give  an  accurate  description  of  the  effects  induced  by  a  non  uniform  illumination  of  the 
crystal  and  predict  the  response  of  a  photorefractive  JTC  using  a  BSO  crystal  in  the  Fourier 
plane.  It  shows  that  the  photorefractive  response  is  strongly  non  linear.  This  non  linearity  is 
controlled  by  the  dark  conductivity  and  uniform  illumination  by  the  reading  beam. 
Opportunities  of  controlling  it  are  then  open.  They  are  now  being  studied. 

These  results  are  part  of  a  contribution  to  the  NAOPIA  II  Project,  funded  by  the 
Commission  of  the  European  Community  under  the  ESPRIT  Program  for  Research  and 
Development. 
Abstract. This article concerns the experimental construction and application
of a precision optical correlator for analysis of large format images with the
computer processing of output fields for fast and precise determination of
correlation  maxima  coordinates. 


1.  Introduction 

For  the  structural  analysis  of  highly  informative  images,  for  example,  astro-  and  aerial 
photography,  we  effectively  use  optical  correlators  [1].  To  provide  a  very  high  speed  of 
processing  of  information  the  correlator  should  include  an  automatic  image  analyser,  which 
has a link to the computer. But for the detection and analysis of correlation images, a special unit
based  on  a  2-D  optical  bistable  element  and  two  linear  photosensors  can  be  used. 

In  this  paper  we  describe  the  experimental  realisation  of  an  optical  correlator  with  the 
computer  processing  of  output  images  and  with  the  analyser  on  the  basis  of  an  optical  bistable 
element. 

2.  Automated  correlator  with  analyser  based  on  a  linear  scanning  photosensor  unit 

For  the  solution  of  the  task  of  precise  determination  of  coordinates  of  correlation  maxima  for 
the  analysis  of  large  format  images  (180*180  mm)  we  have  developed  a  precision  coherent 
optical  correlator  according  to  the  scheme  of  Vander  Lugt,  based  on  an  optical  Fourier 
processor with aperture 500 mm and resolution up to 100 lin/mm, a holographic recorder of type
FIRS 1001 with a working area of 40*40 mm and resolution up to 1500 lin/mm, and scanning
photoreceiving units Eikonix 1000 (Kodak Co.) with a resolution of 4096*4096 elements,
connected to an IBM PC AT computer.

The output image of the correlation function formed by the Fourier processor is read
by the scanning unit, digitised to 256 levels and put into the computer for digital
processing. The algorithms realised in this way permit one to automatically determine the
location of correlation peaks, by increasing the operator threshold, independently of their
dimensions and form and with regard for unevenness of the background on which the images are
optically formed. The algorithms are optimised for speed and permit analysis of one
correlation field with information capacity 4096*4096*8 bits in a period of five minutes,
writing the results to a file, with the possibility of depicting the map of locations of detected
correlation maxima and fragments of the read image in pseudocolours on the monitor of the
computer.
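
The details of these algorithms are not given, so the following Python sketch is only one plausible reading of the description, not the authors' implementation. It assumes a block-wise median as the estimate of the uneven background, raises an integer threshold until no more than a requested number of connected regions remain, and returns intensity-weighted centroids. The function names, the block size and the synthetic test field are all hypothetical.

```python
from collections import deque
import numpy as np

def remove_background(image, block=16):
    """Subtract a block-wise median as a crude estimate of the uneven background."""
    flat = image.astype(float)
    for r in range(0, flat.shape[0], block):
        for c in range(0, flat.shape[1], block):
            tile = flat[r:r + block, c:c + block]   # view into `flat`
            tile -= np.median(tile)
    return flat

def label_regions(mask):
    """4-connected component labelling by breadth-first flood fill."""
    labels = np.zeros(mask.shape, dtype=int)
    count = 0
    for seed in zip(*np.nonzero(mask)):
        if labels[seed]:
            continue
        count += 1
        labels[seed] = count
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                inside = 0 <= nr < mask.shape[0] and 0 <= nc < mask.shape[1]
                if inside and mask[nr, nc] and not labels[nr, nc]:
                    labels[nr, nc] = count
                    queue.append((nr, nc))
    return labels, count

def find_peaks(image, max_peaks, levels=256):
    """Raise the threshold until at most `max_peaks` regions survive; return centroids."""
    flat = remove_background(image)
    for threshold in range(1, levels):
        labels, count = label_regions(flat > threshold)
        if count <= max_peaks:
            break
    peaks = []
    for k in range(1, count + 1):
        rows, cols = np.nonzero(labels == k)
        weights = flat[rows, cols]
        peaks.append((float(np.average(rows, weights=weights)),
                      float(np.average(cols, weights=weights))))
    return peaks

# Synthetic 8-bit field: two correlation spots on a sloping background.
rows, cols = np.mgrid[0:64, 0:64]

def spot(r0, c0):
    return 100 * np.exp(-((rows - r0) ** 2 + (cols - c0) ** 2) / (2 * 1.5**2))

field = np.round(20 + 0.5 * cols + spot(24, 24) + spot(40, 8)).astype(np.uint8)
peaks = find_peaks(field, max_peaks=2)
print(peaks)
```

Working on regions rather than single pixels makes the result independent of the size and shape of each peak, which is the property emphasised in the text.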

The  Fourier  optics  for  our  correlator  was  produced  by  Vavilov  State  Optics  Institute 
(St. Petersburg). The experimental method of conducting investigations was directed at
achieving maximum sensitivity in locating correlation maxima in images, which results in optical
processing of large format highly informative images. The usage of the described automated
correlator permits one to significantly shorten the time of analysis of images (less than 200 sec
on a Compaq 386) and ensure high precision and reliability of the results obtained. The programs
were written in Borland C++.

3.  Correlator  with  analyser  on  the  basis  of  OBE 

The speed of the correlator is limited by the frame frequency of the communication link to the
computer. But in some tasks it is enough to establish the presence or absence of one or more
correlation maxima in the output image, received on a free working correlator. In this case the
output analyser may be built on the basis of a two-dimensional optical bistable element (OBE)
in the mode of a threshold element, behind which we place a collecting lens and a point
(one-element) photoreceiver. When the threshold is exceeded at any point of the two-dimensional
OBE, it switches into the transparent state there, which increases the photoelectric current in
the circuit of the photoreceiving element. A similar analyser, equipped with a light dividing prism,
two crossed cylinder lenses and linear photodetectors connected to the computer, may be used
for the determination of the coordinates of correlation maxima in the output image of the optical
processor. Because of the threshold processing with the help of the OBE, the amount of
information at the output is substantially reduced. The coordinates of the maximum pixel
exceeding the tuning threshold of the OBE are easily determined by the cell coordinates in the
linear detectors on which light falls. The result of processing is given to the computer, to
which all units are connected; this computer ensures their synchronisation and makes
decisions on recognition. The uncertainty in coordinates arising from the simultaneous presence
of several correlation maxima is easily removed by additional testing on the computer. The
described analyser can ensure high speed, because it interrogates only 2N photoreceiving cells,
rather than N*N as in the case of using a two-dimensional matrix.
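
The 2N readout and the ambiguity it produces with several maxima can be illustrated with a small model. In the Python sketch below the OBE is idealised as a binary threshold, the two crossed cylinder lenses are modelled as sums along rows and columns, and the "additional testing" is replaced by a hypothetical check against the thresholded field itself, since the text does not say how that test is carried out. Array sizes, values and function names are illustrative.

```python
import numpy as np

def linear_detector_readout(switched):
    """Signals on the two linear detectors: sums along rows and along columns (2N cells)."""
    return switched.sum(axis=1), switched.sum(axis=0)

def candidate_coordinates(row_signal, col_signal):
    """Every (row, column) pair consistent with the illuminated detector cells."""
    rows = np.flatnonzero(row_signal)
    cols = np.flatnonzero(col_signal)
    return [(int(r), int(c)) for r in rows for c in cols]

N = 8
correlation = np.zeros((N, N))
correlation[1, 6] = 0.9
correlation[5, 2] = 0.8
correlation[3, 3] = 0.2            # weak maximum, below the OBE threshold
switched = correlation > 0.5       # OBE idealised as a threshold element

row_signal, col_signal = linear_detector_readout(switched)
candidates = candidate_coordinates(row_signal, col_signal)

# Hypothetical stand-in for the additional testing done on the computer.
confirmed = [rc for rc in candidates if switched[rc]]
cells_read = row_signal.size + col_signal.size

print("cells read:", cells_read, "instead of", N * N)
print("candidates:", candidates)
print("confirmed :", confirmed)
```

Two maxima give four candidate positions, two of which are spurious; a single maximum is located unambiguously from the two illuminated cells.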

We  used  2-D  thin  film  optical  bistable  devices  for  the  experimental  analysis  of  coherent 
images,  formed  by  an  optical  correlator  at  very  high  speed.  The  unit  which  makes  this  analysis 
consists of one 2-D optical bistable device, two multi-element linear photosensors and one
cubic beamsplitter [2]. The 2-D optical bistable devices had 38 mm diameter and were made
in the Division of Optical Problem of Informatic (Academy of Sciences of Belarus). This OBE
has less than 4% spatial nonlinearity of its threshold. The switching time is 200 ms and
resolution is 30 lin/mm. This unit provided a good signal/noise ratio and we have today
sensitivity  to  discover  correlation  maxima  only  to  4  times  less. 
Abstract. A new architecture for optical coordinate transformations with two
phase-only  filters  is  proposed,  which  is  described  by  the  Wigner  distribution 
function.  A  precise  theory  and  experimental  results  are  presented. 


1. Introduction

Many optical coordinate transformation methods have been proposed. Bryngdahl first
proposed the use of computer-generated holograms in coordinate transforms. Holograms for
scale and rotation invariant transforms have been designed. Bartelt and Case proposed
using a multifacet hologram for point-to-point mapping. Most of the methods studied are
based  on  the  coherent  optical  Fourier  transformation  and  paraxial  approximation  in  the 
coordinate  transformation.  Davidson  et  al.  proposed  to  determine  the  grating  vector  of  a 
Fourier  transform  hologram.  In  these  methods,  however,  it  is  difficult  to  retain  the 
resolution  of  the  input  images. 

We propose here an alternative non-Fourier transform approach, in which two phase-only
filters are used and the input plane and the output plane of the system are conjugate to
each other.


2.  Phase-only  filter  design 


Consider an optical system in which two phase-only filters are used and the input plane and
the output plane are conjugate to each other, as shown in Fig. 1. Two phase-only filters
$\exp[j\phi_1(r)]$ and $\exp[j\phi_2(r)]$ are positioned at a distance of $z$ from the input plane and at
the Fourier transform plane of a lens $L_1$, respectively. The Fourier transform of the first filter
is obtained in the second filter plane. The final output is the Fourier transform of the
transmitted light from the second filter. In terms of the paraxial approximation, the optical system
is generally described by the following double Wigner distribution function:


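$$K(r_2, \alpha_2, r_1, \alpha_1) = \iint h\!\left(r_2 + \frac{r_2'}{2},\, r_1 + \frac{r_1'}{2}\right) h^*\!\left(r_2 - \frac{r_2'}{2},\, r_1 - \frac{r_1'}{2}\right) \exp[-jk(\alpha_2 \cdot r_2' - \alpha_1 \cdot r_1')] \, dr_1' \, dr_2', \qquad (1)$$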


where $h(r_2, r_1)$ denotes the point spread function of the system. Equation (1) gives a
relationship between heights $r$ and angles $\alpha$ in the input and the output planes. In the
particular case of Fig. 1, the optical system is represented by

)  =  Jj  jj  JJ  J| 5{n  -  - za,  )6(fl,  -  a,  )§(/! '  -r,  )5  a, ' -a,  -  d/jda, 

xS(rj-/a,')8fa, +-^ldr|'da,’6(/i'-rj)8  a^'-a^  drjdoj 

K  J  )  V  *  J 

x5(/;-/a2')5  a^H-^jdrj'da^'.  (2) 

Suppose  the  phase  of  the  second  filter  is  written  by  the  following  form: 


We  have  the  double  Wigner  distribution  function  written  by; 

A:(/-2',aj',/i,aj)  =  6(/-j'~rj)6  a^ ' -a,  -  ^ =h[r^'-r^h  a^'-a^-js\^  .  (4) 

We  generally  consider  the  following  coordinate  transformation: 

(5) 

Then  we  have  final  phase  distributions  of  filters  as  follows: 


Fig. 1 Optical system for coordinate transform
As a typical example, we consider the following coordinate transformation:

The  phase  distributions  of  the  filters  are  given  by 

3. Experiment

We have designed the phase-only filters of Eqs. (10) and (11). The fabrication of phase-only
filters  is  technically  difficult,  and  so  we  have  made  binary  filters  with  the  computer-generated 
hologram  technique,  as  shown  in  Fig.2.  Figure  3  shows  an  input  pattern  example  (a)  of  the 
grating  whose  pitches  are  exponentially  changed.  Because  of  the  logarithmic  coordinate 
transformation  system,  we  have  the  output  (b)  of  an  equi-spaced  grating  image. 
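
The reason an exponentially pitched grating becomes equi-spaced is elementary and can be checked in a few lines. The Python sketch below assumes a mapping of the form u = ln(x/x0); the exact transformation used in the experiment is not reproduced here, and the pitch ratio, the reference length `x0` and the function name are hypothetical.

```python
import numpy as np

def log_map(x, x0=1.0):
    """Assumed logarithmic coordinate transformation u = ln(x / x0)."""
    return np.log(x / x0)

pitch_ratio = 1.2                                   # each line sits 1.2 times further out
lines_in = 1.0 * pitch_ratio ** np.arange(10)       # exponentially spaced grating lines
lines_out = log_map(lines_in)
spacing_out = np.diff(lines_out)

print("input spacings :", np.diff(lines_in))
print("output spacings:", spacing_out)
```

Every output spacing equals ln(1.2), so a change of scale of the input appears only as a shift of the output pattern.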


Fig. 2 Binary computer-generated holograms for Eqs. (10) and (11).

Fig. 3 Logarithmic coordinate transformation, (a) Input image, (b) output image.


4.  Conclusion 

We  have  proposed  a  new  architecture  of  the  optical  coordinate  transformation  using  a  pair  of 
phase-only filters and derived the analytical solution of the filters. Optical experiments with
computer-generated filters verify the analytical solution for logarithmic coordinate
transformation. Features of the architecture are discussed in comparison with Bryngdahl's
Fourier transform method.
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Abstract 


A  new  means  of  breaking  symmetry  in  binary  phase  only  matched  filters  is  reported.  A 
randomly  pixellated  diffractive  element  is  combined  with  a  dynamic  binary  filter  displayed 
on  a  ferroelectric  liquid  crystal  spatial  light  modulator  to  produce  a  pseudo  four-level 
phase  matched  filter.  The  four  phase  levels  produced  are  sufficient  to  break  inversion 
symmetry  without  penalising  the  peak  to  noise  ratio  of  the  correlator.  A  simulated 
annealing algorithm is used to generate the dynamic section of the filter.


The  Binary  Phase  Only  Matched  Filter  (BPOMF)  has  been  well  documented  since  Spatial 
Light Modulators (SLMs) capable of binary phase modulation were developed [1]. A less
publicised feature of the BPOMF is the rotational symmetry due to the binary phase levels
of 0 and π. The Hermitian impulse response of the BPOMF means that the filter cannot
discriminate  between  the  reference  and  the  same  reference  rotated  by  180°.  For  most  pattern 
recognition  tasks,  the  inherent  rotational  invariance  is  an  advantage,  but  for  some  applications 
such  as  distinguishing  between  the  characters  d  and  p,  the  property  can  lead  to  incorrect 
decisions.  In  an  application  such  as  road  sign  recognition,  triangular  signs  have  very  different 
implications  for  each  orientation.  An  asymmetric  BPOMF  is  thus  desirable.  Symmetry  can 
be broken by increasing the number of phase levels from two to four. Previous work has shown
that two binary SLMs in a Mach-Zehnder interferometer geometry [2] can be used to display
the filter, but this technique is expensive and complicates the optical design. Recently, a
pseudo four level phase system has been used to break the symmetry in computer generated
holograms displayed on a single binary phase SLM [3]. In this letter we show that such an
approach  can  be  used  with  great  advantage  in  the  BPOMF.  With  filters  displayed  on  a  single 
SLM  we  demonstrate  that  it  is  possible  to  break  the  BPOMF  symmetry  without  penalising 
the  peak  to  noise  ratio. 

The correlator is a classical 4f Vander Lugt BPOMF. Immediately following the filter SLM
(in binary phase mode) is a Pixellated Diffractive Element (PDE), which is used to create a
pseudo four level system. The PDE has the same pixel pitch as the filter SLM and is aligned
pixel to pixel with the SLM. Each pixel on the random phase array is set to either 0 or π/2
and is fixed. Each pixel on the SLM can be set to 0 or π and is dynamic. The combination
of the array and the SLM means that phase levels from either [0, π] or [π/2, 3π/2] can be
selected depending on the pattern on the SLM. Ideally, the phase array would be physically
etched onto the glass surface of the SLM, but for initial demonstration we have constructed
a photoresist pattern on a λ/10 optical flat to generate the π/2 phase step.

Figure 1: Simulation of the pseudo four phase level system applied to a triangular road sign

The BPOMF was generated by a simulated annealing algorithm [4, 5] which was adapted to
include the random phase array in the calculation. An ideal correlation (delta function) was
used as the target for the algorithm and a plane of zeros used as the target for the rejection
criteria. An 87x56 pixel triangular warning sign sampled from the U.K. Highway Code was
used as the input to be recognised and the rotated sign used for rejection. Figure 1 shows a
simulation of the pseudo four phase filter with a π/2 phase step and a triangular road sign
input in both orientations. Although the PDE constrains the design of the matched filter,
there is sufficient phase freedom in the system for the filter to optimise very efficiently. The
suppression of the symmetric order for an ideal phase step of π/2 was 16dB and there was no
change in the peak to noise ratio, 27.5dB, when compared to a symmetric BPOMF generated
by the same algorithm. For a phase step of more than π/2, the suppression was reduced, but
is still sufficient for good discrimination. The PDE constructed had a delay of 2.09rad; with
this delay the symmetric suppression was reduced to 5.1dB but the peak to noise ratio
remained at 27.5dB.
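
The symmetry argument can be illustrated with a short numerical sketch. It is not the simulated annealing design used above: the filters here are obtained by simple pixel-wise thresholding of the matched-filter phase, and the 32x32 random test image, the helper names `flip180` and `correlation_peak`, and the random seed are all assumptions made for illustration.

```python
import numpy as np

def flip180(img):
    """Rotate by 180 degrees on the periodic FFT grid, i.e. f(-x, -y)."""
    return np.roll(img[::-1, ::-1], shift=(1, 1), axis=(0, 1))

def correlation_peak(scene, filt):
    """Peak magnitude of the correlation plane for a frequency-plane filter."""
    return np.abs(np.fft.ifft2(np.fft.fft2(scene) * filt)).max()

rng = np.random.default_rng(0)
ref = rng.random((32, 32))
ref -= ref.mean()                      # hypothetical zero-mean reference image
rot = flip180(ref)                     # the same reference rotated by 180 degrees

F = np.fft.fft2(ref)
target_phase = np.angle(np.conj(F))    # phase of the ideal matched filter

# Two-level filter (phases 0 and pi): a real-valued filter.
bpomf = np.where(F.real >= 0, 1.0, -1.0)

# Pseudo four-level filter: fixed random PDE (0 or pi/2) plus dynamic SLM (0 or pi).
pde = rng.integers(0, 2, size=F.shape) * (np.pi / 2)
slm = np.where(np.cos(target_phase - pde) >= 0, 0.0, np.pi)
four_level = np.exp(1j * (pde + slm))

peaks = {
    "binary, reference": correlation_peak(ref, bpomf),
    "binary, rotated": correlation_peak(rot, bpomf),
    "four-level, reference": correlation_peak(ref, four_level),
    "four-level, rotated": correlation_peak(rot, four_level),
}
for name, value in peaks.items():
    print(f"{name:22s} {value:.3f}")
```

Because the two-level filter is real, the correlation planes for the reference and for its 180° rotation are mirror images of each other with equal peak magnitudes, so the two cannot be separated by thresholding. With the combined PDE and SLM phases the filter is no longer real and the two peaks differ.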

A chrome on glass mask, with 128x128 pixels (220μm pitch matching the filter SLM), was
made containing a totally random binary pattern. The mask was then used to transfer
the pattern onto a layer of photoresist spun onto an optical flat. The flat was then developed
leaving the raised pattern in photoresist. The exact thickness of the photoresist determines
the retardation of the step. A step size of 247nm was required for π/2 at λ = 633nm. The
thinnest photoresist layer achieved was 318nm using Shipley S1400-17 spun at 4000rpm for 40
seconds, which creates a phase delay of 2.09rad at 633nm, which is sufficient to demonstrate
the asymmetric properties.


The experimental layout is a modified 4f correlator shown in Figure 2. The SLMs used were
128x128 pixel ferroelectric liquid crystal devices from THORN EMI CRL. The pixel pitch
was 220μm for both the input and filter SLM. The three lens system in Figure 2 was used to
compress the effective length of the transform lens, whilst keeping the wavefront distortion
to below one wavelength. The PDE was mounted as close to the SLM analyser as possible
to minimise the effects of diffraction through the pixels. Figure 3 shows the results of the
experiment. The symmetry suppression was limited by the noise in the output plane to 4.8dB,
which is sufficient for good discrimination between the two rotated images. The noise was
mainly due to nonuniformities in the photoresist layer of the PDE and diffraction across the
gap between the analyser and the PDE.

Figure 2: Experimental layout of the modified 4f BPOMF

Figure 3: Experimental BPOMF correlation plane for the correct and mirrored triangular
roadsign

We have presented a new means of suppressing the symmetric correlation peaks in the
BPOMF without requiring an extra SLM or penalising the peak to noise ratio in the correlation
plane. Performance would be further improved by constructing an SLM with the random
π/2 phase pattern accurately etched onto the glass surface of the SLM cell. Furthermore, as
the BPOMF is generated by simulated annealing, it is possible to train it for applications
which require other invariant properties whilst suppressing the symmetry.

The authors would like to thank Dr J. Brocklehurst and M. Birch of CRL for providing the
SLMs and Dr D.C. O’Brien for many useful discussions. The work was partially supported
by ESPRIT project number 7050, HICOPOS.

Abstract. We show the possibility of implementing a cellular automaton called Life
by means of an optical correlator. Since this cellular automaton is a universal
computer there is no inherent limitation in the information processing capabilities of
correlators.


1. Introduction

Optical  correlators  are  probably  the  most  popular  and  well-studied  devices  used  in 
optical  pattern  recognition.  They  are  applied  to  a  wide  variety  of  problems  including  those 
requiring  multiple  invariances.  However,  there  is  no  guarantee  that  a  particular  problem  can 
be  solved  by  means  of  an  optical  correlator,  and  therefore  it  is  a  legitimate  question  whether 
they  can  be  considered  as  general  purpose  classifier  devices. 

The main difficulty of single filter correlations is the limited number of images that
can be stored in the filter, which limits the complexity of the problems that can be solved.
These limitations have been analyzed in Ref. 1. A possible way to overcome this drawback
is the use of multichannel correlators, which enable us to obtain arbitrarily complex
processing [1,2].

In this communication we show that it is possible, at least in theory, to use iterative
techniques as an alternative to multichannel schemes to solve the inherent limitations of single
channel correlators. We propose an iterative process involving a single-channel correlation
processed by a simple nonlinear function. This process enables us to optically implement a
cellular automaton, called Life, with powerful capabilities to process information.

2. Cellular automata.

Cellular automata are usually described as a regular infinite array of cells that can
take one of several allowed states. Automata evolve in discrete time steps (generations)
governed by deterministic rules. The state of a given cell in the generation k+1 is a function
of its previous state (that at generation k) as well as of the states of its neighbors (also at
generation k). The updating of the cells is performed synchronously, that is, all at the same
time. Some of these cellular machines exhibit a behavior complex enough to support
universal computation [3]. They are called class-IV automata and one of the most studied is
the game of Life [4]. The fact that Life can be implemented by correlation leads to the
conclusion that a correlator is as powerful, in its information processing capabilities, as the
most powerful machines known (or, for those who accept the Church-Turing thesis, as the
most powerful machines that can exist), namely universal computers.

3.  Optical  implementation  of  Life. 

The cells of Life can be in one of two possible states, dead or alive. The
transitions between these two states are controlled by the states of each cell's eight neighboring
cells following these basic rules:

- A cell that is alive at time k continues living at time k+1 if and only if it has 2 or
3 living neighbors.

- A cell that is dead at time k becomes alive at time k+1 if it has exactly three living
neighbors.

From these rules it is evident that the state of any cell is determined by the total sum
(assuming one for live cells and zero for the dead ones) of the values of its neighbors, and
does not depend on their individual values. This property is called totalistic and enables a
simple optical implementation. By correlating an input Life pattern with the filter represented
in Fig. 1 and processing the output intensity distribution by the function f(x) = rect(x/10 - 1)
we obtain the next generation.

Figure 1. a) Filter in object space. b) Filter in Fourier space.

As an example of the computing capabilities of the automaton we show the
construction of an AND gate. The correlation was simulated by using the Fast Fourier
Transform (FFT) algorithm over an array of 128x128 cells. The input pattern is that shown
in Fig. 2.a, where the two inputs are labelled by the letters A and B. The signals are carried
by patterns formed by five cells called gliders (labelled with ones), which move one position
diagonally every four generations. The presence or absence of a glider represents the bit 1 or
0 respectively. The structure at the bottom left corner is a glider gun, a periodic pattern that
emits a glider every 30 generations. With the proper timing and alignment the collision
(indicated by V) of two gliders produces a vanishing reaction which annihilates both patterns,
as shown in Fig. 2.b. The logic operation is based on these collisions.

Figure 2. Life AND gate. a) The desired operations are AND (A=1, B=1) and AND (A=0, B=1).
b) The first glider of channel A opens a hole in the stream. c) The first glider of channel B passes
while the solitary second glider of channel B is destroyed. d) The eater eliminates the residual gliders.

When there is a glider in both channels (A and B) the collision of one of them opens
a hole in the stream produced by the glider gun, which the second glider uses to pass
through (Fig. 2.c). When there is only one glider it is eliminated by the stream (Fig. 2.c),
and therefore by detecting the presence or absence of gliders at the position indicated in
Fig. 2.d at predetermined times we obtain the output of an AND gate. The rest of the stream
produced by the glider gun is eliminated by the eater (the collision C shown in Fig. 2.d) to
avoid interference with other structures.
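
The correlation-plus-nonlinearity update of Section 3 can be sketched numerically as follows. The filter of Fig. 1 is not reproduced in the text, so the kernel used here is an assumption: a 3x3 block with weight 1 on the eight neighbors and weight 1/2 on the centre cell, which is one choice that makes f(x) = rect(x/10 - 1) applied to the intensity reproduce the Life rules. The grid size and the periodic boundary are also illustrative choices.

```python
import numpy as np

def life_step(grid):
    """One Life generation: FFT correlation, intensity, then rect(x/10 - 1)."""
    n, m = grid.shape
    kernel = np.zeros((n, m))
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            kernel[dy % n, dx % m] = 1.0   # eight neighbors (periodic grid)
    kernel[0, 0] = 0.5                     # assumed centre weight
    spectrum = np.fft.fft2(grid) * np.conj(np.fft.fft2(kernel))
    amplitude = np.real(np.fft.ifft2(spectrum))
    intensity = amplitude ** 2
    # rect(x/10 - 1) equals 1 for 5 < x < 15 and 0 otherwise.
    return (np.abs(intensity / 10.0 - 1.0) < 0.5).astype(int)

# A glider on a 16x16 array.
glider = np.zeros((16, 16), dtype=int)
for r, c in [(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)]:
    glider[r, c] = 1

state = glider
for _ in range(4):
    state = life_step(state)
```

With this kernel the correlation amplitude is the neighbor count plus half the cell's own value, so the surviving and newborn cells give intensities 6.25, 9 and 12.25, all inside the window of the rect function, while every other case gives at most 4 or at least 16. After four generations the glider reappears displaced by one cell diagonally, as described above.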

As can be observed, the repeated application of the simple rules that govern the
evolution of Life patterns produces a very rich behaviour: the automaton permits the
construction of travelling structures (the gliders) to send signals, these gliders can be
eliminated by collision with other gliders or with stationary patterns (the eater), etc. Based
mainly on these patterns we can perform a complete imitation of a universal computer; that
is, we can build different logic gates and interconnect them to build memory elements, for
example [4]. As a consequence, the iterative correlation process by which we have
implemented Life can be used to solve arbitrarily complex mathematical problems such as
those involved in pattern recognition. However, it is not necessary to codify pattern
recognition problems as Life structures to solve them. We have used the game only to show
that the proposed architecture is free of the a priori restrictions of single-filter correlations.
This is the usual way to show the equivalence of computational power between different
architectures of computing devices. It therefore remains necessary to find a practical way to
take advantage of this framework by designing specific processes that solve a particular
pattern recognition problem.

Abstract. The use of Minimum Average Correlation Energy (MACE) and Minimum Noise and
Correlation Energy (MINACE) filters for the recognition of defocused images is studied. We propose
to adapt the filter simultaneously to the original and to the blurred images to improve the recognition
process. In this way the value of the correlation peak is stable for a wide range of blurring in the input
image.


The  input  image  in  an  optical  correlator  may  be  degraded  by  different  sources  of 
blurring.  A  common  degradation  is  defocus.  Usually  the  defocusing  will  correspond  to  an 
image  acquisition  process  which  is  essentially  incoherent,  but,  for  optical  processing 
purposes,  the  blurred  images  will  always  be  the  input  of  a  coherently  illuminated  correlator. 

In  the  recognition  process  of  real  images  we  can  distinguish  two  different  cases.  In 
a  first  situation  the  blurring  of  the  whole  scene  containing  different  images  is  fixed,  so 
simply  the  ratio  of  the  corresponding  correlation  peaks  may  be  a  good  criterion  for  a  final 
thresholding.  A  second  case  corresponds  to  different  blurrings  for  the  different  images.  Here, 
to  apply  a  threshold  it  is  necessary  that  the  peak  value  does  not  change  with  defocusing. 

We  will  consider  only  the  second  case,  therefore  we  will  be  interested  in  the 
maximum  stability  of  the  peaks  and  in  keeping  the  difference  between  the  correlations  of  the 
target  and  the  rejected  images.  We  need  differences  high  enough  to  allow  a  thresholding  in 
the  correlation  plane  to  classify  the  objects. 

The theoretical performance of several correlation filters, such as the Classical Matched
Filter, Phase Only Filter and Inverse Filter, has been previously studied [1]. Moreover, their
implementation in convergent correlators [2] or in a joint transform architecture [3] has been
recently studied.

One possibility to improve the performance of the correlation method is the use of
Minimum Correlation designs such as the MACE and MINACE filters [4]. With these designs
it is possible to control the correlation at the origin with different input objects (the training
set). This allows the blurred images to be included as part of the training set and therefore
defocusing invariant filters to be obtained. We will show that this is a useful technique to
improve the performance of a MACE or MINACE filter for a wide range of defocused scenes.


Figure  1.  Test  objects  used  in  the  experiments. 

Figure 1 shows the original (sharp) object to be detected together with the two blurry
versions. We also show the two objects to be rejected. The butterflies are 54x64 pixels in
size, zero-padded to a total of 128x128 pixels. The Point Spread Functions of the blurring are
approximated by circles of diameter d = 1 (non-defocused), 4.16 and 20.8 pixels.

Figures  2  and  3  represent  the  values  of  the  correlation  intensity  peaks  (obtained  with 
the  different  filters)  as  a  function  of  the  defocusing.  In  both  figures,  the  lines  with  * 
correspond  to  a  filter  designed  to  give  a  central  correlation  value  1  with  the  non-defocused 
target and 0 with the two non-targets. Although the discrimination capability is good, we see
that the correlation value decreases rapidly as the defocus increases.

Since the MACE and MINACE filter design methods allow multiple target correlation
conditions, we can use a defocused object as one of the targets and we can impose the central
correlation value with the non-defocused and with the defocused target to be the same. The
lines with 'o' correspond to filters designed to give the same central peak for the
non-defocused and maximum defocused targets (diameter = 20.8 pixels). Note the good stability
for intermediate degradations. Lines with 'x' are for non-defocused and slight defocusing
(diameter = 4.16 pixels). In this case the performance also decays as blurring increases. In
all cases the MACE filters are less resistant than MINACE ones.
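
The multiple-target design can be sketched with the standard closed-form MACE solution h = D^-1 X (X^H D^-1 X)^-1 u, where the columns of X are the Fourier transforms of the training images, D is their average power spectrum and u holds the prescribed central correlation values. The sketch below is illustrative only: the random 32x32 images, the blur diameter of 8 pixels and the helper names `disk_blur` and `mace_filter` are assumptions, not the butterflies or code used in this work.

```python
import numpy as np

def disk_blur(img, diameter):
    """Blur with a normalised circular PSF of the given diameter (periodic)."""
    n, m = img.shape
    y = np.fft.fftfreq(n) * n          # signed pixel coordinates around the origin
    x = np.fft.fftfreq(m) * m
    psf = (y[:, None] ** 2 + x[None, :] ** 2 <= (diameter / 2) ** 2).astype(float)
    psf /= psf.sum()
    return np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(psf)))

def mace_filter(images, peaks):
    """Return (X, h) with h = D^-1 X (X^H D^-1 X)^-1 u in the frequency domain."""
    X = np.stack([np.fft.fft2(im).ravel() for im in images], axis=1)
    D = np.mean(np.abs(X) ** 2, axis=1)        # average power spectrum
    Dinv_X = X / D[:, None]
    u = np.asarray(peaks, dtype=complex)
    h = Dinv_X @ np.linalg.solve(X.conj().T @ Dinv_X, u)
    return X, h

rng = np.random.default_rng(1)
target = rng.random((32, 32))                  # hypothetical target
reject_a = rng.random((32, 32))                # hypothetical non-targets
reject_b = rng.random((32, 32))
target_blurred = disk_blur(target, 8.0)        # strongly defocused target

# Same central value for the sharp and the defocused target, zero for non-targets.
X, h = mace_filter([target, target_blurred, reject_a, reject_b], [1, 1, 0, 0])
central_values = X.conj().T @ h

# Response to an intermediate blur that was not part of the training set.
mid = np.fft.fft2(disk_blur(target, 4.0)).ravel()
print("training responses:", np.round(np.abs(central_values), 6))
print("intermediate blur :", abs(np.vdot(mid, h)))
```

The constraints are met exactly for the training images; the response to the intermediate blur is not constrained, and how close it stays to 1 is the stability property examined in Figures 2 and 3.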

This lack of resistance to defocusing in the MACE design when no blurred images are
included in the training set is due to the high values of the filter components corresponding
to frequencies of low energy in the training images. Since it can be assumed with wide
generality that these low energy components are those of high frequency, the MACE filter
will be very sensitive to even small changes in the high frequency components of the input
images. On the other hand, as we can see in Figures 4 and 5, the defocusing of an image
mainly affects its high frequency components (the effect can be approximated by a low-pass
filter), so the error is largely amplified, producing significant variations in the
correlation peaks.

Figure 2.- MACE Filter    Figure 3.- MINACE Filter

Correlation intensity peaks versus defocusing obtained with MACE and MINACE filters.
Target is butterfly B1. Cross-correlations with the other butterflies are also given.

Figure 4    Figure 5

Cross-section of B1 power spectrum: 4) sharp image and 5) defocused image (d=20.8).

The MINACE filter, although specifically designed for noise resistance, shows better
performance due to the limitation imposed on the maximum value of a filter component
corresponding to a low-energy frequency. As expected, the autocorrelation values of
MINACE filters are higher than the same autocorrelations corresponding to MACE filters for
the same training set.

Figure 6 shows 3D views of different correlations in the case of MACE filters. In
figures a) and b) the filter is designed using the non-blurred objects. In c) and d) the blurred
target (PSF of 20.8 pixels) is also included. These correlations are obtained with non-blurred
butterflies (figures 6a and 6c) and with blurred butterflies (figures 6b and 6d) B1, B2 and B3,
with a blur of 20.8 pixels in each picture. Notice the stability of the peaks when the blurred
target is included in the training set.

In  figure  7  we  see  the  peak  to  correlation  energy  (PCE)  of  the  different  MACE  and 
MINACE  filters.  For  MACE  filters,  the  PCE  is  quite  constant  within  the  range  considered 
for  their  design,  decreasing  rapidly  outside.  For  MINACE  filters,  the  behaviour  is  more 
stable  with  respect  to  defocusing. 

Figure  8  shows  the  Signal-to-Noise  Ratio  for  all  the  filters.  In  all  cases  the  SNR  is 
stable  for  the  defocusing  range  used  in  the  design.  Outside  this  range  the  SNR  degrades 
exponentially.  As  expected  because  of  the  design,  the  SNR  with  MINACE  filters  is  higher 
than  that  of  MACE  filters. 

In conclusion, we can use MACE or MINACE filters designed with blurred images
in the training set in order to provide resistance to defocusing in our identification system
even in the most restrictive case of different defocus for each captured image. We have found
that it is enough to include the original and the maximum degraded images to obtain good
stability in all the range. If we use MINACE filters we add noise tolerance, higher stability
and we obtain more light efficiency.


Figure 6.- Correlation intensities with the three butterflies. A MACE filter is designed to
give the following responses: a), b) B1 = 1, B2 = 0, B3 = 0; c), d) B1 = 1, B2 = 0, B3 = 0,
B1 = 1 (defocused with d=20.8). a), c) sharp scene; b), d) scene defocused with d=20.8.


Figure 7    Figure 8

Lines with (*) correspond to MACE filters. Continuous lines are for the case when only
the sharp target is used. Long dashed lines use also the d=4.16 blurred target, and short
dashed lines correspond to using the sharp and the most defocused target (d=20.8). The
same convention applies to MINACE lines.

Abstract. An optical system for the Haar wavelet transform of binary images
based on a time-division-multiplexing technique is described. We show that a simple
parallel image differentiator can handle the bipolar wavelet function efficiently in
the system.


1.  Introduction 

Recently, various optical architectures of the wavelet transform (WT) have been investigated
to get fast transform results by taking advantage of the parallelism of optics [1].
Following Yang et al. [2], we performed experiments on the optical Haar WT using a
shadow casting system. To handle the bipolar nature of the wavelet functions, Yang et al.
used a polarization encoding technique in their shadow casting system with a coherent
light source. Through a properly rotated analyzer, the phase difference between the
lights from the positive and negative parts of the wavelet function becomes π, by which
the amplitude subtraction is obtained in their experiment. However, we could not obtain
good amplitude subtraction results because the sum of two coherent beams with a
phase difference π still interfered constructively at higher-order positions in the output
focal-plane.

To represent bipolar images in incoherent systems, area encoding, wavelength multiplexing,
and time division multiplexing (TDM) techniques have been commonly used
[3]-[5]. In these schemes, a separate subtraction process is required after the two outputs
of the different signs are obtained. Here we adopt the TDM technique and show that a
simple parallel image differentiator can be used efficiently for the subtraction.

2. The Haar WT based on TDM technique

The Haar WT of a 2-D image $s(x,y)$ is mathematically described by

$$W(p,q,a,b) = \frac{1}{\sqrt{ab}} \iint s(x,y)\, h\!\left(\frac{x-p}{a}, \frac{y-q}{b}\right) dx\,dy \qquad (1)$$

where $h(x,y)$ is the mother Haar wavelet function, $a$ and $b$ are dilation factors, and $p$
and $q$ are shifting factors. Let us denote the positive and negative parts of the wavelet
function by $h^{+}(x,y)$ and $h^{-}(x,y)$, respectively. That is,

$$h^{+}(x,y) = \mathrm{rect}(x-0.5,\, y-0.5) + \mathrm{rect}(x+0.5,\, y+0.5)$$

$$h^{-}(x,y) = -\mathrm{rect}(x+0.5,\, y-0.5) - \mathrm{rect}(x-0.5,\, y+0.5). \qquad (2)$$

Thus $h = h^{+} + h^{-}$.

The Haar WT based on the TDM technique means that $h^{+}$ and $|h^{-}|$ are presented
alternately during $2jt_0 < t < (2j+1)t_0$ and $(2j+1)t_0 < t < (2j+2)t_0$, respectively,
where $j = 0, 1, 2, \ldots$ and $t_0$ is the flipping time interval of the positive and negative
wavelet patterns. Thus the TDM version of Eq. (1) can be written as

$$W_{TDM}(p,q,a,b,t) = W^{+}(p,q,a,b) \sum_{j} \left[u(t-2jt_0) - u(t-(2j+1)t_0)\right] + W^{-}(p,q,a,b) \sum_{j} \left[u(t-(2j+1)t_0) - u(t-(2j+2)t_0)\right] \qquad (3)$$

where $W^{\pm}(p,q,a,b) = (1/\sqrt{ab}) \iint s(x,y)\,|h^{\pm}((x-p)/a,\,(y-q)/b)|\,dx\,dy$ and $u(t)$ is the
Heaviside step function. Note that $W_{TDM}$ is always unipolar if $s(x,y)$ is a unipolar
function.

To obtain the Haar WT result, $W(p,q,a,b)$, the subtraction $W^{+} - W^{-}$ should be
executed. We can get this subtraction by differentiating $W_{TDM}$ of Eq. (3) in the time
domain, i.e.,

$$\frac{\partial W_{TDM}}{\partial t} = W^{+}(p,q,a,b)\,\delta(t) + \left[W^{+}(p,q,a,b) - W^{-}(p,q,a,b)\right] \sum_{k=1} (-1)^{k}\, \delta(t - kt_0) \qquad (4)$$

where $\delta(t)$ is the Dirac delta function. Practically, $k = 1, 2$ is enough. Eq. (4) implies
we can get the magnitude of $W^{+} - W^{-}$ by detecting the pulse energy at every output
position $(p,q)$ after $t > 0$. Note that there is no ideal differentiator and thus no ideal
delta function. A practical differentiator is a high pass filter. In this case, the pulse
amplitude is proportional to $|W^{+} - W^{-}|$ in Eq. (4). The sign of the subtraction result can
also be obtained if the pulse polarity is detected.
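
The relation between the two unipolar outputs and the bipolar transform can be checked with a small numerical sketch. It is a discrete illustration under stated assumptions, not a model of the optical system: the 16x16 random binary image, the dilation `a` (equal in both directions), the periodic boundary and the helper `circ_correlate` are all hypothetical, and the placement of the positive quadrants follows Eq. (2) only up to the orientation of the y axis.

```python
import numpy as np

def circ_correlate(image, kernel):
    """Circular cross-correlation via the FFT (kernel zero-padded to image size)."""
    padded = np.zeros_like(image, dtype=float)
    padded[:kernel.shape[0], :kernel.shape[1]] = kernel
    spectrum = np.fft.fft2(image) * np.conj(np.fft.fft2(padded))
    return np.real(np.fft.ifft2(spectrum))

a = 2                                              # hypothetical dilation, a = b
haar = np.kron(np.array([[1.0, -1.0], [-1.0, 1.0]]), np.ones((a, a)))
h_plus = np.maximum(haar, 0.0)                     # positive part h+
h_minus_abs = np.maximum(-haar, 0.0)               # |h-|, displayed as a unipolar pattern

rng = np.random.default_rng(0)
s = (rng.random((16, 16)) > 0.5).astype(float)     # binary (unipolar) input image

W_plus = circ_correlate(s, h_plus) / a             # 1/sqrt(ab) with a = b
W_minus = circ_correlate(s, h_minus_abs) / a
W = circ_correlate(s, haar) / a                    # bipolar Haar WT for comparison

# TDM sequence: W+ is shown during even intervals, W- during odd ones.
frames = np.array([W_plus if k % 2 == 0 else W_minus for k in range(6)])
steps = np.diff(frames, axis=0)                    # jumps at t = k*t0, k = 1..5
```

Both `W_plus` and `W_minus` are non-negative, their difference equals the bipolar transform, and the jump of the TDM signal at t = k t_0 is (-1)^k (W+ - W-), which is the weight of the delta pulses in Eq. (4).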


3.  Optical  Haar  WT  system  based  on  TDM  technique 

Our system that implements Eqs. (3) and (4) is shown in Figure 1. The first part, which
calculates $W_{TDM}$, is a shadow casting system whose structure is similar to that in [2].
The patterns for $h^{+}$ and $|h^{-}|$, which are flipped alternately, are represented by the light
transmittance of a spatial light modulator (SLM1). A sequence of discrete shifting factors
$(p_m, q_n)$, where $m$ and $n$ are integers, is represented by the positions of the point sources.
The dilation factors $a$ and $b$ can be modified by moving the position of the flipping
wavelet pattern (SLM1) along the optical axis. This movement enlarges or shrinks the
shadow size of the wavelet pattern according to the moving directions. Thus for a given
size of the wavelet pattern, a and ... is detected and fed to SLM2 in the
image differentiator or novelty filter (NF) [6],[7].

Figure 1: The optical Haar WT system based on TDM technique.

Either two-beam coupling or four-wave mixing NF can be used for the image differentiator.
The two-beam coupling NF shown in the lower part of Figure 1 is simple
and works with not only phase-modulated but also intensity-modulated images. The
light intensity of the pump beam is much higher than that of the image-bearing signal
beam. In the photorefractive crystal (BaTiO3), holographic gratings are generated by
these two beams. The crystal is oriented so that the signal beam is strongly deamplified.
Physically, this takes place because the beam diffracted by the hologram from the pump
beam into the direction of the signal beam interferes destructively with the directly
passing signal beam. In other words, the diffracted beam and the signal beam are nearly
equal in field intensity and π out of phase with each other in the steady state [6],[7]. Thus
they tend to cancel out.

Suppose some portion of the input signal beam, for example, $W^{+}(p_m, q_n)$, changes
suddenly into $W^{-}(p_m, q_n)$ at $t = (2j+1)t_0$. The hologram in the crystal cannot follow
this change instantly because of the photorefractive time constant. Therefore, only the
amount of beam intensity that is not cancelled by the diffracted beam from the pump
beam, $|W^{+}(p_m, q_n) - W^{-}(p_m, q_n)|$, appears in the image plane of the output signal beam
at that instant. Since there is no further variation in the signal beam until the next $t_0$
duration, new gratings are formed after a time lapse of about a photorefractive time
constant. The destructive interference is regained. An example of this circumstance is
depicted in Figure 2(a). The time interval, $t_0$, should be greater than the photorefractive
time constant. The way our system works is depicted in Figure 2(b) when the input
unipolar function $s(x,y)$ is a rectangular image.

4.  Discussion  and  Conclusion 

Both the two-beam coupling and four-wave mixing NFs detect only the magnitude
$|W(p_m, q_n)|$, as other optical wavelet transform systems do [1], because they produce
only unipolar light pulses whenever $W_{TDM}$ changes suddenly. To detect both the sign
and magnitude of $W$, a beam fanning differentiator may be used, which produces positive
light pulses only when $W_{TDM}$ increases [6],[8]. Therefore, if the pulses are generated at
$t = kt_0$ of even (odd) $k$, the sign of $W$ is positive (negative). In this case the photorefractive
crystal should be placed near the image plane of $W_{TDM}$ so that the beam fanning can
occur independently for every beam spot of $(p_m, q_n)$. In general, the speed of beam
fanning depends on the input beam intensity. Thus one disadvantage of this scheme
will be that $t_0$ should be long enough for the beam fanning to take place sufficiently at
possibly weak beam spots.

Figure 2: An example of the way our system works (gedanken experiment). (a) The
action of two-beam coupling differentiator at an output position $(p_m, q_n)$; (b) The Haar
wavelet transform result for an input of unipolar rectangular image.

The system in Figure 1 is not a fully parallel processor, because of the CCD detector
and SLM2, although they operate at a real-time video rate. They can be replaced with a
parallel photodetector-SLM module, for example, a liquid crystal light valve.

The time needed to obtain $|W(p_m, q_n)|$ with this optical system is $t_0$, roughly speaking
the photorefractive time constant, which is determined by the strong pump beam
intensity. It is noteworthy that this time constant is independent of the size of the input
image $s(x,y)$ and the number of point sources in the array $(p_m, q_n)$.

In conclusion, we explained an optical shadow-casting system for the Haar wavelet
transform based on a TDM technique. We showed that a simple image differentiator using
a photorefractive crystal can handle the bipolar wavelet function. Thus the serial subtraction
process associated with the bipolar nature of the wavelet function can be eliminated
efficiently.

Fractional Fourier transforms, imaging systems and

Abstract. Fractional Fourier transforms of integer, fractional and complex
degrees can be implemented by lenses. Optical units that realize such basic
transformations are shown to be useful in the implementation of more complex
optical systems such as imaging systems and correlators.

1. Introduction


Recently, Lohmann [1] has shown that fractional Fourier transforms (FRFTs), defined on the
basis of Wigner functions, can be optically implemented by lenses, after generalizing the
usual single-lens and two-lens geometries used for the implementation of the conventional
Fourier transform. Lohmann's analytical definition is based on collimated illumination.
A more general definition for divergent and convergent illumination can be written for both
the single-lens and the two-lens configurations [2].


2.  Implementation  of  FRFTs  with  lenses 

Figure 1(a) shows the geometry for the single-lens configuration of FRFTs. The variables
$z_1$ and $f$ affect the value of the degree $p$ of the FRFT, $F^{(p)}$, through the expressions:

$$ \sin\phi = \frac{\sqrt{z_1(2f - z_1)}}{f}, \qquad \cos\phi = \frac{f - z_1}{f} \qquad (1) $$


where $\phi = p\pi/2$. The angle $\phi$, and therefore $p$, can be complex numbers. In this case, the
sinusoidal functions can be expressed in terms of sinusoidal and hyperbolic functions of real
arguments. The scale factor of the transform, $f_p$, depends on $z_1$, $f$ and the distance $\rho$ from the
point source to the object plane, for either divergent ($\rho > 0$) or convergent ($\rho < 0$) illumination,
through the expression:

$$ f_p = \frac{\rho f}{\rho + z_1 - f} \qquad (2) $$

The distance $z_2$ from the lens to the plane where $F^{(p)}$ is formed depends on $\rho$, $f$ and $z_1$
according to the expression:

$$ z_2 = \frac{(\rho + f)\, z_1}{\rho + z_1 - f} \qquad (3) $$

Figure 1. General geometry for (a) the single-lens configuration, (b) the
two-lens configuration, for $\rho < 0$; $o(x,y)$ is the object transparency, $(x,y)$ the
object plane and $(x_p, y_p)$ the FRFT plane.

If the illumination is not collimated, the scale and the position of $F^{(p)}$ change in general,
compared with the case of collimated illumination. However, as is well known, this is not the
case for $F^{(1)}$, the perfect conventional Fourier transform, where only the position changes.

Figure 1(b) presents the general geometry of the two-lens configuration for $\rho < 0$. This case
can be reduced to the case of a single-lens geometry with convergent illumination. The first
lens, of focal length $f_0$, determines the value of $\rho$; the second lens, of focal length $f$, realizes
the FRFT. Equations (1), (2) and (3) are also valid for the two-lens configuration, where
$\rho = -\rho_0 f_0/(\rho_0 + f_0)$. For collimated illumination, $\rho_0$ is infinite, $\rho = -f_0$, and then


(4)


In the two-lens configuration, if both lenses have the same focal length $f$, then $z_2 = 0$. The
focal length $f_0$ and $\rho_0$ determine the value of $\rho$, and thus the scale factor $f_p$ and $z_2$.
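
The dependence of the degree, scale factor and output distance on the lens geometry can be explored with a short sketch. It assumes that equations (1) to (3) read cos(phi) = 1 - z1/f, f_p = rho f / (rho + z1 - f) and z2 = (rho + f) z1 / (rho + z1 - f), where `rho` is the point-source-to-object distance (named `rho` here to keep it distinct from the degree p). These forms are a reading of the printed expressions, and the function name is hypothetical.

```python
import cmath
import math


def frft_parameters(z1, f, rho=math.inf):
    """Return (p, f_p, z2) for a lens of focal length f.

    z1  : object-to-lens distance
    rho : point-source-to-object distance (inf means collimated light)
    """
    cos_phi = 1.0 - z1 / f            # assumed form of equation (1)
    phi = cmath.acos(cos_phi)         # becomes complex when |cos_phi| > 1
    p = 2.0 * phi / math.pi           # degree, from phi = p * pi / 2
    if math.isinf(rho):
        return p, f, z1               # collimated limit of (2) and (3)
    denom = rho + z1 - f
    if denom == 0:
        raise ValueError("rho + z1 - f = 0: the parameters diverge")
    f_p = rho * f / denom             # assumed form of equation (2)
    z2 = (rho + f) * z1 / denom       # assumed form of equation (3)
    return p, f_p, z2


if __name__ == "__main__":
    print(frft_parameters(100.0, 100.0))             # object one focal length away
    print(frft_parameters(50.0, 100.0, rho=-100.0))  # two-lens case with f0 = f
    print(frft_parameters(250.0, 100.0))             # z1 > 2f gives a complex degree
```

Under these assumptions the sketch reproduces the limiting cases stated in the text: with z1 = f the scale factor stays equal to f for any rho while the output plane moves, and with rho = -f the output distance z2 vanishes.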


The most general expression of $F^{(p)}$ [2] for both configurations is:


where $\phi$ and $f_p$ are defined by equation (1) and equation (2), respectively.

The inverse FRFT, $F^{(-p)}$, can be obtained from equation (5) by substituting $-i$ for $i$.
The implementation of the inverse FRFT is also realized by the two configurations, substituting
negative lenses for the positive lenses and interchanging the object and the transformation
planes [3]. The object and the transform distributions are in those cases virtual. According to
equation (1), the FRFTs may have a complex degree for certain values of $z_1$.

For collimated illumination, the geometries in both configurations take the form
represented in figure 2(a), where $z_2 = z_1$, and in figure 2(b), where $z_2 = 0$; the general
expression for $F^{(p)}$ (equation (5)) takes different forms in the two configurations [3].


Figure 2: (a) single-lens configuration, where $z_2 = z_1$; (b) two-lens
configuration, where $z_2 = 0$.

An important parameter associated with any $F^{(p)}$ is the so-called standard focal length, $f_1$ [1].
The expressions for $f_1$, for the single-lens configuration and for the two-lens configuration,
are respectively:

$$ f_1 = f \sin\phi, \qquad f_1 = f\,\frac{1 - \cos\phi}{\sin\phi} $$

Transforms $F^{(p)}$ with the same parameter $|f_1|$ belong to the same family, meaning that, when
they are implemented and cascaded in tandem, they satisfy the property of the FRFTs
expressed by the relationship:

$$ F^{(\alpha)}\{F^{(\beta)}\{f(x,y)\}\} = F^{(\alpha+\beta)}\{f(x,y)\} $$

The meaning of the standard focal length $f_1$ becomes clear. It represents the scale factor of
$F^{(1)}$ resulting from the application, in tandem, of other FRFTs belonging to the same
family.

3.  Generalized  afocal  imaging  systems  and  correlators 

Perfect imaging optical systems or correlators, without additional phase curvatures at the
output or $F^{(1)}$ planes, can be realized by cascading basic units (see figure 2) which
implement fractional Fourier transforms of the same family. In particular, $F^{(\alpha+\beta)}$ may
represent a conventional Fourier transform of an object distribution if $\alpha + \beta = 2n + 1$
($n = 0, 1, 2, \dots$), an inverted image if $\alpha + \beta = 4n + 2$, or an erect image if $\alpha + \beta = 4n$.
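
As a small illustration of this rule, the sketch below adds the degrees of cascaded units of the same family and reports what the cascade produces. The function name and the tolerance are assumptions made for the example; only the additivity of the degrees and the three cases listed above come from the text.

```python
def classify_cascade(degrees, tol=1e-9):
    """Sum the degrees of cascaded FRFT units and name the overall operation."""
    total = sum(degrees)
    nearest = round(total)
    if abs(total - nearest) > tol:
        return total, "fractional Fourier transform"
    if nearest % 2 == 1:                 # total = 2n + 1
        return total, "conventional Fourier transform"
    if nearest % 4 == 2:                 # total = 4n + 2
        return total, "inverted image"
    return total, "erect image"          # total = 4n


if __name__ == "__main__":
    print(classify_cascade([2 / 3, 2 / 3, 2 / 3]))
    print(classify_cascade([0.5, 0.5]))
```

For example, three units of degree 2/3 add up to degree 2, so the cascade behaves as an imaging system with an inverted image.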


Figure 3 shows two examples of imaging systems based on FRFTs of the same
family. One complex unit of the single-lens configuration, followed by another complex unit
of the two-lens configuration, is used in figure 3(a); three real units of the two-lens
configuration are used in figure 3(b). In the latter case, the perfect conventional
Fourier transform $F^{(1)}$ is formed in an intermediate plane. This can be understood by symmetry
considerations. Therefore, this imaging system is
also a correlator with a scale factor $f_1 = f/2$.




Figure  3:  (a)  perfect  imaging  system  with  two  units  of  complex  degrees, 
(b)  perfect  imaging  system  and  perfect  correlator;  o(x,y)  at  the  input  plane 
is  the  object  transmission  function. 


4.  Conclusion 

It has been shown that it is possible to implement imaging systems and correlators based on
FRFT units constituting generalized afocal systems. This concept may simplify the design of
complex systems.

Abstract. In this paper, an optical wavelet transform system with a lenslet array, together with its
implementation in the space domain, is presented. Several optical experimental results are
demonstrated which are in keeping with computer simulations.


1.  Introduction 

In recent years, much attention has been paid to the wavelet transform (WT) because it shows great
potential in transient signal processing [1-4]. The WT is an efficient way to represent a signal
(image) in both the time (space) and frequency domains. On the other hand, the WT is a
time-consuming process, and high-speed computational tools are needed to implement it.
An optical implementation of the WT, taking advantage of high-speed parallel processing, is
therefore desirable. Some optical wavelet transform (OWT) methods in which the WT is considered as a
bank of filters have been reported, together with results obtained by the Fourier transform method
through spatial filtering in the frequency domain. OWT in the space
domain has also been proposed [5-6]. In this paper, we report a programmable OWT in which a lenslet
array is employed and the optical Haar wavelet transform is realised in the space domain for
binary images.


Fig. 1 Space arrangement of wavelets (submatrices $h_{xy11}$ to $h_{xyNN}$)

2. 2-D Discrete wavelet transform representation

There are two forms of the wavelet transform: the continuous form and the discrete form. In the case of
a 2-D discrete wavelet transform (DWT), the WT expansion coefficients $W_{ijkl}$ of the image
$s(x,y)$ can be formed as follows:

$$ W_{ijkl} = \sum_x \sum_y s(x,y)\, h\!\left(\frac{x - x_i}{a_j}, \frac{y - y_k}{a_l}\right) \qquad (1) $$

The omission of the normalisation factor in the formula does not affect the WT operation [7].

In the 2-D DWT, the coefficients $W_{ijkl}$ form a 4-D matrix, with the discrete translation
factor $(x_i, y_k)$ and the dilation factor $(a_j, a_l)$. The WT coefficients with a given dilation factor
can be represented by a 2-D matrix indexed by the translation factor $(x_i, y_k)$. In our
experiment, the results of the WT with NxN discrete translation factors and a selected dilation
factor are obtained in parallel on a 2-D output plane. The coordinates of the output plane
correspond to the discrete translation factor $(x_i, y_k)$. The WT for different dilation factors can
be implemented step by step, and each step corresponds to a WT under a fixed dilation factor.
Because the programmable OWT system is under the control of a PC, dilation factors can be
easily selected.
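
The summation for one fixed dilation factor can be sketched digitally as follows. The sketch assumes that Eq. (1) has the form W = sum over x and y of s(x,y) h((x - x_i)/a_j, (y - y_k)/a_l), and it uses the four-square Haar mother wavelet given later in the paper as Eq. (3). The function names, the half-open convention for `rect` and the test image are assumptions made for illustration.

```python
def rect(u, v):
    """Unit square centred on the origin (half-open so that tiles do not overlap)."""
    return 1.0 if -0.5 <= u < 0.5 and -0.5 <= v < 0.5 else 0.0


def haar_mother(u, v):
    """2-D Haar mother wavelet built from four unit squares."""
    return (rect(u - 0.5, v - 0.5) + rect(u + 0.5, v + 0.5)
            - rect(u + 0.5, v - 0.5) - rect(u - 0.5, v + 0.5))


def dwt_coefficients(image, mother, a_j, a_l, translations):
    """Direct summation of s(x, y) * h((x - x_i)/a_j, (y - y_k)/a_l).

    Returns one 2-D matrix, indexed by the translation (x_i, y_k),
    for the single dilation factor (a_j, a_l).
    """
    result = []
    for x_i in translations:
        row = []
        for y_k in translations:
            total = 0.0
            for x, line in enumerate(image):
                for y, value in enumerate(line):
                    total += value * mother((x - x_i) / a_j, (y - y_k) / a_l)
            row.append(total)
        result.append(row)
    return result


if __name__ == "__main__":
    corner = [[1, 1, 0, 0],
              [1, 1, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0]]
    print(dwt_coefficients(corner, haar_mother, 2, 2, [2]))
```

Each call corresponds to one step of the optical procedure, in which a single dilation factor is selected and all translations are evaluated together.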


3.  Programmable  OWT  system 

For the fixed dilation factor $(a_j, a_l)$, the wavelets are represented by $h_{xyik}$:

$$ h_{xyik} = h\!\left(\frac{x - x_i}{a_j}, \frac{y - y_k}{a_l}\right) \qquad (2) $$

It can also be considered as a 4-D matrix, and the wavelet matrix (WM) can be
arranged as a 2-D array (indexed by $i,k$) of 2-D submatrices (indexed by $x,y$). These
wavelets are displayed on a high-resolution video monitor addressed by a CRT monitor
according to the above space arrangement (Fig. 1).

The experiment uses 8x8 discrete wavelets, and each wavelet consists of 8x8 discrete
values which can be calculated from Eq. (2). The indexes of the translation factor run from
(1,1) to (8,8) with one fixed dilation factor.

In the experiment, a lenslet array consisting of 8x8 lenses is used to implement
the optical summation of Eq. (1). Each lenslet of the array has a diameter of 8.2 mm and a
focal length of 385 mm. Each lens images a specific wavelet submatrix onto the input image, which is
displayed on an SLM with 8x8 binary pixels. The images of the wavelet submatrices are superimposed on
the input image, and the beam transmitted by the SLM is collected onto the output plane to form an
8x8 pixel array which represents the 8x8 products of the 2-D wavelet submatrices with the input
image (Fig. 2 and Fig. 3). Each of the pixels indexed by the translation factor $(x_i, y_k)$ on the
output plane corresponds to a WT of the input image under the wavelet indexed by the same
translation factor with a specific dilation factor.

Fig. 2 Optical design of the OWT system

Fig. 4a The wavelet mask of OWT

Fig. 4b OWT of the input image “Da”


The wavelets have both positive and negative values, and there are several common ways
to process the negative values optically [2]. In the experiment we use two steps: first, all
the positive values of the wavelets are displayed and the OWT results of the input for the positive
wavelets are obtained; then the negative wavelets are processed in the same way. Finally, the
two results, detected by a CCD, are subtracted with a computer.
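
The two-step handling of bipolar wavelets can be sketched as follows: the mask is split into its positive and negative parts, each part is applied to the image as a non-negative pattern, and the two detected sums are subtracted. The function names and the small example arrays are hypothetical.

```python
def split_bipolar(mask):
    """Split a bipolar mask into two non-negative masks."""
    positive = [[max(v, 0.0) for v in row] for row in mask]
    negative = [[max(-v, 0.0) for v in row] for row in mask]
    return positive, negative


def optical_sum(image, mask):
    """Sum of pixel-wise products, standing in for one optical exposure."""
    return sum(s * m
               for s_row, m_row in zip(image, mask)
               for s, m in zip(s_row, m_row))


def bipolar_coefficient(image, mask):
    """Subtract the negative-mask result from the positive-mask result."""
    positive, negative = split_bipolar(mask)
    return optical_sum(image, positive) - optical_sum(image, negative)


if __name__ == "__main__":
    image = [[1, 0], [1, 1]]
    mask = [[1, -1], [-1, 1]]
    print(bipolar_coefficient(image, mask))
```

The subtraction gives the same value as applying the bipolar mask directly, at the cost of two exposures and one electronic subtraction per coefficient set.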


4.  Optical  Haar  Wavelet  Transform 

In  the  Haar  Wavelet  Transform,  a  2-D  Haar  mother  wavelet  can  be  described  as  follows: 
h(x,y)=rect(x-0.5,y-0.5)+rect(x+0.5,y+0.5)-rect(x+0.5,y-0.5)-rect(x-0.5,y+0.5)  (3) 

We choose the Haar wavelet with $a_j = a_l = 1$ and $x_i, y_k = 1, 2, \dots, 8$. The input binary image
$s(x,y)$ also consists of 8x8 discrete pixels. In this case, substituting Eq. (3) into Eq. (1), we obtain
$W_{ijkl} = s(i,k)$, so the WT coefficients with this dilation factor can be denoted by $W_{ik}$
and form a 2-D matrix. The WT of
an image therefore has the same form as the input image for the dilation factor $a_j = a_l = 1$. The optical
result is in keeping with computer simulation. Although the WT of the image under this
dilation factor has no obvious significance, the results can be easily tested through computer
simulation, and provide a practical confirmation of the optical results, especially
for a processed image with only a few pixels.

The Chinese character “Da” is the input binary image for the experiment; the
character is made up of 8x8 dots. For the discrete binary image, the Haar wavelets with the dilation
factor ($a_j = 1$, $a_l = 1$) have no negative values, and the result can be obtained from Eq. (1). The
wavelet mask displayed on the video monitor is shown in Fig. 4a, and the OWT of the input
image “Da” is shown in Fig. 4b. This result is in keeping with computer simulations (Fig. 6).


5.  Conclusion 

We proposed an architecture for an OWT in the space domain. The optical Haar wavelet
transform is implemented with the method. However, a great deal of engineering
effort is needed to design a lenslet array consisting of a large number of lenses to implement a
DWT with fine dilation and translation factors for images made up of many pixels. Moreover,
we also need an SLM with a large dynamic range of grey levels to represent wavelets with
multiple values.

Abstract: The frequency response of Low Resolution Fresnel
Encoded Lenses is calculated in terms of all the parameters that
characterize these lenses, and possible applications to pattern
recognition are discussed.


1.  PSF  of  Low  Resolution  Fresnel  Encoded  Lenses 


When a Fresnel lens with focal length $f$ for a wavelength $\lambda$ is encoded in a
pixelated low-resolution device, with a center-to-center pixel distance given by
$\Delta x$, $\Delta y$, infinitely many new focal regions appear at the coordinates $(kX, lY)$, where
$X = \lambda f/\Delta x$, $Y = \lambda f/\Delta y$ and $k, l$ are arbitrary integers [1]. A low resolution
Fresnel encoded lens (LRFEL) resembles an array of lenses, each of size $X \times Y$. Then, if the
device has $N \times M$ pixels with a rectangular pupil of dimensions $L_x = N\Delta x$ and
$L_y = M\Delta y$, the number of lenses appearing in the device is given by $W_x = L_x/X$
and $W_y = L_y/Y$. In Fig. 1 we can see an LRFEL
with $W_x = W_y = 3$.
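
These relations are easy to evaluate numerically. The sketch below computes the spacing of the focal regions and the number of replicated lenses from the wavelength, focal length, pixel pitch and pixel count; the function name and the sample values (in micrometres) are hypothetical.

```python
def lrfel_geometry(wavelength, focal_length, pitch_x, pitch_y, n_x, n_y):
    """Focal-region spacing (X, Y) and number of lenses (W_x, W_y).

    All lengths must be given in the same unit.
    """
    X = wavelength * focal_length / pitch_x   # spacing of foci along x
    Y = wavelength * focal_length / pitch_y   # spacing of foci along y
    L_x = n_x * pitch_x                       # pupil width
    L_y = n_y * pitch_y                       # pupil height
    return {"X": X, "Y": Y, "W_x": L_x / X, "W_y": L_y / Y}


if __name__ == "__main__":
    # 120 x 120 pixels of 4 um pitch, 0.5 um light, 1.28 mm focal length.
    print(lrfel_geometry(0.5, 1280.0, 4.0, 4.0, 120, 120))
```

Because W_x equals N times the squared pitch divided by the product of wavelength and focal length, shortening the encoded focal length increases the number of replicated lenses.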

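Fig. 1 LRFEL with $W_x = W_y = 3$.

The amplitude distribution at a $(k,l)$ focus for
plane wave illumination is given by (Eq. 20 of Ref. [1]):

$$ U_{k,l}(x,y) = \frac{1}{i\lambda f\,\Delta x\,\Delta y}\left\{\mathrm{sinc}\!\left(\frac{L_x x}{\lambda f}, \frac{L_y y}{\lambda f}\right)\exp\!\left[i\frac{2\pi}{\lambda f}(x\,kX + y\,lY)\right]\right\} * \mathrm{rect}\!\left(\frac{x}{\Delta x'}, \frac{y}{\Delta y'}\right) \qquad (1) $$

The sinc function comes from the Fourier transform of the rectangular pupil of
the device and the rect function defines the pixel of dimensions $\Delta x' \times \Delta y'$.
Other useful parameters are $c_x = \Delta x'/\Delta x$ and $c_y = \Delta y'/\Delta y$.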


2. Coherent transfer function for a (k,l) focus

In order to calculate the frequency response we must perform the Fourier
transform of Eq. (1). Taking the pixel size as the natural unit of length, we obtain:

$$ H_{k,l}(g_x, g_y) = \mathrm{rect}\!\left(\frac{g_x - k}{W_x}, \frac{g_y - l}{W_y}\right)\mathrm{sinc}(c_x g_x, c_y g_y) \qquad (2) $$

For a $(k,l)$ focus, the rect function is shifted by $k$ in the $g_x$ direction and by $l$
in the $g_y$ direction. The same happens in the general case (any shape of the pupil or the pixel):
we will have the integral of the Fourier transform of the pixel
function multiplied by the shifted pupil function. Now, let us study some
different cases:

First we consider $k = l = 0$. In this case the cut-off frequency is
determined by the rect function at the frequency (in pixels$^{-1}$) $g_{c,x} = W_x/2$, $g_{c,y} = W_y/2$.
For $W_x > 2/c_x$ or $W_y > 2/c_y$, the first zero of $H$ is given by $g_{0,x} = 1/c_x$,
$g_{0,y} = 1/c_y$. If $W_x < 2/c_x$ or $W_y < 2/c_y$, then $g_{0,x} = g_{c,x}$ or $g_{0,y} = g_{c,y}$. For
smaller values of the focal length $H$ takes negative values and inversions of
contrast may happen. For $c_x \to 0$ and $c_y \to 0$ we obtain the $H$ corresponding to the
infinite-resolution case (with a diffraction efficiency also tending to zero). Thus
the limitation imposed by the low-resolution condition is a function of $c_x$ and $c_y$, the
ratios between the pixel size and the center-to-center pixel distance.
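
The cut-off and first-zero rules above can be checked with a one-dimensional sketch. It assumes that the transfer function along one axis is the product of a rect of width W centred on k and sinc(c g), with sinc(u) = sin(pi u)/(pi u) and frequencies in inverse pixels. This form is a reading of Eq. (2), and the function names are hypothetical.

```python
import math


def sinc(u):
    """Normalised sinc, sin(pi u) / (pi u)."""
    return 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)


def transfer_1d(g, k, W, c):
    """Assumed 1-D transfer function: rect((g - k) / W) * sinc(c * g)."""
    inside_pupil = abs(g - k) <= W / 2.0
    return sinc(c * g) if inside_pupil else 0.0


def first_zero(W, c):
    """First zero of the central (k = 0) transfer function."""
    cutoff = W / 2.0
    # The sinc zero at 1/c is reached only if it lies inside the rect.
    return 1.0 / c if W > 2.0 / c else cutoff


if __name__ == "__main__":
    print(first_zero(40, 1.0), first_zero(3, 0.5))
    print(transfer_1d(1.5, 0, 40, 1.0))   # beyond the first zero, H is negative
```

The negative value returned beyond the first zero corresponds to the contrast inversion mentioned in the text.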

We can now make general considerations for different $k, l$. When $c_x W_x/2$,
$c_y W_y/2 \gg 1$, the rect function is wide compared with the sinc function
and its effect will not be appreciable for $k < W_x/2 - 1/c_x$ and $l < W_y/2 - 1/c_y$. Under
these conditions each focal region satisfying the inequalities will replicate nearly the
same information. This situation is sketched in Fig. 2 for $W_x = W_y = 40$ and
$c_x = c_y = 1$. Note that the inequalities can be satisfied for the particular case of
a short focal length with $c_x, c_y \approx 1$, or a large focal length with $c_x, c_y \ll 1$. When
$c_x W_x/2$, $c_y W_y/2 \ll 1$, the rect function is narrower than the sinc function.
For $k, l > 0$, $H$ has the aspect of Fig. 2. The
$H$ for each $k, l$ corresponds only to a small
window of frequencies. In such a case each
focus will concentrate information
corresponding to a certain range of
frequencies. If $W_x = W_y = 1$, the
windows are the ones sketched in Fig. 3.
In this case each focus replicates different
information but all the information is
replicated.

Fig. 2 $H$ for $W_x = W_y = 40$ and $c_x = c_y = 1$


3. Application to pattern recognition

We have seen that in certain conditions
each focus can replicate different
information of the object. The central
focus can replicate the information
corresponding to a window of low
frequencies and the other ones can
replicate windows corresponding to
high frequencies. This makes it possible to
compare separately the information corresponding to high and low frequencies,
and different kinds of filters can be used in order to process this
information appropriately.

Fig. 3 Windows for the foci $k = -1, 0, 1$ ... (dashed) and 0.2 (solid)

We have also seen that for short focal length encoded lenses each focus
replicates nearly the same information. This opens the door to applying different
types of filters to the same image in order to improve the recognition process.

Abstract. We propose a phase-only filter with maximum discrimination capability. A zero-modulation
state is introduced to block certain spatial frequencies in order to equalize the phase difference
histogram. The discrimination capability is increased significantly compared with that of the POF.


1.  Introduction 

The performance of different spatial filters with high discrimination capability (DC) has been
investigated in recent years. Several types of filters have been proposed to increase this parameter
(DC): the Phase Only Filter (POF) [1], the Optimal Filter [2] and other techniques [3]. The POF has
better discrimination than the classical matched filter but is more sensitive to noise. Hence, in
[4] B.V.K.V. Kumar and Z. Bahri introduced the notion of an optimal support function for a Phase
Only Filter and defined the Optimal Phase Only Filter (OPOF) in the sense of maximizing
the signal-to-noise ratio (SNR). The support function indicates which pixels in the filter have
magnitudes equal to 1 and which pixels have zero magnitude. Other support functions that
optimize different criteria or multiple criteria have been proposed [5]. In all of these designs the
support functions are calculated by taking into account the amplitude distribution. The phase
distribution plays a crucial role in the discrimination capability and, in consequence, we propose
a support function to optimize the DC based on the phase information.

In  this  contribution  we  propose  a  method  that  optimizes  the  discrimination  capability  by 
using  a  POF  with  a  support  function.  The  procedure  is  based  on  the  modification  of  the  phase 
difference  histogram  by  blocking  some  frequencies.  We  investigate  the  DC  and  we  show  the 
improvement  in  DC  obtained  with  this  method.  Simulated  and  experimental  results  are  presented. 

2.  Theoretical  analysis 

We denote by $t(x,y)$ and $d(x,y)$ the target and the nontarget, respectively. The target $t(x,y)$ is
placed at the origin, and the nontarget $d(x,y)$ is placed in the scene at coordinates such that the
maximum of the cross-correlation is at the origin. Let $T(u,v) = |T(u,v)| \exp[i\phi_t(u,v)]$ and
$D(u,v) = |D(u,v)| \exp[i\phi_d(u,v)]$ be their Fourier transforms. When the POF is used, the auto- and
the cross-correlation functions in the Fourier domain are given by:

$$ C_{tt}(u,v) = |T(u,v)| $$
$$ C_{td}(u,v) = |D(u,v)| \exp[i\Delta\phi(u,v)] \qquad (1) $$

where $\Delta\phi(u,v) = \phi_d(u,v) - \phi_t(u,v)$ is the phase difference. The discrimination capability, DC, is
a parameter that measures the ability of the correlator to discriminate between two very similar
input objects, and it is defined as:

$$ DC = 1 - \frac{\left|\sum_{u,v} |D(u,v)| \exp[i\Delta\phi(u,v)]\right|^2}{\left|\sum_{u,v} |T(u,v)|\right|^2} \qquad (2) $$
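
A minimal numerical sketch of this definition is given below. It assumes that the DC is one minus the squared ratio of the cross-correlation value at the origin to the autocorrelation value at the origin, both obtained with a phase-only filter matched to the target. The spectra are passed as flat sequences of complex samples, and the optional `support` mask (1 = transmit, 0 = block) is an illustrative addition.

```python
import cmath


def discrimination_capability(T, D, support=None):
    """DC = 1 - |sum |D| exp(i dphi)|^2 / (sum |T|)^2 over transmitted pixels."""
    if support is None:
        support = [1] * len(T)
    auto = 0.0     # autocorrelation value at the origin
    cross = 0j     # cross-correlation value at the origin
    for t, d, passed in zip(T, D, support):
        if not passed:
            continue                      # blocked frequency
        auto += abs(t)
        dphi = cmath.phase(d) - cmath.phase(t)
        cross += abs(d) * cmath.exp(1j * dphi)
    return 1.0 - abs(cross) ** 2 / auto ** 2


if __name__ == "__main__":
    T = [1 + 0j, 1j, 2 + 0j]
    print(discrimination_capability(T, T))              # identical objects
    print(discrimination_capability([1, 1j], [1, -1j])) # opposite phase errors
```

The DC is zero when the nontarget equals the target and approaches one when the phase differences make the cross-correlation terms cancel.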


Let us divide the $2\pi$ phase interval into $N$ steps of width $\delta\phi = 2\pi/N$. Let $S_j$ be the set of
pixels $(u,v)$ for which the phase difference $\Delta\phi$ is in the interval $\Delta\phi_{j-1} < \Delta\phi < \Delta\phi_j$,
where $\Delta\phi_j = j\,\delta\phi$. Let us define the weighted phase difference histogram as:

$$ P_j = \sum_{(u,v) \in S_j} |D(u,v)| \qquad (3) $$
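
The histogram can be built as in the following sketch, which assumes that each pixel contributes the nontarget magnitude |D(u,v)| to the bin containing its phase difference. Bins are numbered from 0 here rather than from 1, and the function name is hypothetical.

```python
import cmath
import math


def weighted_phase_histogram(T, D, n_bins):
    """Return (hist, bins): the weighted histogram and the bin of every pixel."""
    width = 2.0 * math.pi / n_bins
    hist = [0.0] * n_bins
    bins = []
    for t, d in zip(T, D):
        # Phase difference wrapped into the interval [0, 2 pi).
        dphi = (cmath.phase(d) - cmath.phase(t)) % (2.0 * math.pi)
        j = min(int(dphi / width), n_bins - 1)   # guard against rounding up
        hist[j] += abs(d)                         # weight by |D(u, v)|
        bins.append(j)
    return hist, bins


if __name__ == "__main__":
    T = [1, 1, 1, 1]
    D = [1 + 1j, -2 + 2j, -3 - 3j, 4 - 4j]
    print(weighted_phase_histogram(T, D, 4))
```

Returning the bin index of every pixel alongside the histogram makes it straightforward to decide later which pixels to block.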

We approximate the phase difference of those pixels which belong to $S_j$ by $\Delta\phi_j$; then the DC can
be written as:

$$ DC = 1 - \frac{\sum_{j=1}^{N/2} \left[P_j - P_{N/2+j}\right]^2}{E_{tot}} \qquad (4) $$

The DC is optimized if we equalize $P_j$ and $P_{N/2+j}$ for $j = 1, \dots, N/2$. This can be done by
blocking some frequencies. After the blocking process the phase histogram $P_j$ is equal to $P_{N/2+j}$
and, in consequence, the DC given by equation (4) is equal to 1. In practice it is not necessary to
make $P_j - P_{N/2+j} = 0$. It is enough that this difference satisfies $|P_j - P_{N/2+j}| < \epsilon$; then
$1 - DC < N\epsilon^2/(2E_{tot})$, where $E_{tot}$
is the nonblocked energy. The DC is improved for small values of $\epsilon$.
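
One possible way to choose which frequencies to block is sketched below. It is a simple greedy heuristic written for illustration and is not claimed to be the authors' procedure: pixels are visited from the largest weight down, and a pixel is blocked when its bin still exceeds the opposite bin by at least `eps` and removing it does not overshoot. The heuristic does not guarantee that every pair ends up within the tolerance.

```python
def equalising_support(weights, bins, n_bins, eps):
    """Greedy sketch: block pixels so that opposite histogram bins approach balance.

    weights : |D(u, v)| for every pixel
    bins    : histogram bin index (0 .. n_bins - 1) of every pixel
    Returns (support, hist) where support holds 1 (transmit) or 0 (block).
    """
    half = n_bins // 2
    hist = [0.0] * n_bins
    for w, j in zip(weights, bins):
        hist[j] += w
    support = [1] * len(weights)
    # Visit heavy pixels first so that few pixels need to be blocked.
    order = sorted(range(len(weights)), key=lambda i: -weights[i])
    for i in order:
        j = bins[i]
        partner = (j + half) % n_bins         # bin shifted by pi
        excess = hist[j] - hist[partner]
        if excess >= eps and weights[i] <= excess:
            support[i] = 0                    # zero-modulation state
            hist[j] -= weights[i]
    return support, hist


if __name__ == "__main__":
    print(equalising_support([5.0, 3.0, 1.0, 1.0, 4.0], [0, 0, 0, 0, 2], 4, 1.5))
```

The returned support can then be applied to the phase-only filter, with blocked pixels set to the zero-modulation state.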

3.  Computer  simulation  and  experimental  results 

We investigate the comparative performance of the conventional POF and the optimized POF in terms
of discrimination capability between two similar objects. To illustrate the improvement in
discrimination with the proposed method, we perform numerical experiments using alphanumeric
characters. Figure 1 shows the input scene; the target is the letter F, and the object to be rejected
is the letter E (the letter F is contained in the letter E). In figures 2a and 2b, we show the original and the
modified weighted phase difference histograms, with N equal to 100. The corresponding
correlations are shown in figures 3a and 3b. Comparing these figures shows that the
algorithm proposed in this paper clearly enhances the discrimination capability of the POF.

In the optical experiment we used the scene of figure 1. The scene and the filter were
obtained with the commercial polygraphic machine Linotronic 630 with a resolution of 3251 dpi.
The filters are encoded by Burckhardt's method [6] in a size of 9 x 9 mm. The light source is a
15 mW linearly polarized He-Ne laser.

Fig. 1. Input scene.

Fig. 2a. The original Weighted
Phase Difference Histogram.

Fig. 2b. The modified Weighted
Phase Difference Histogram.


The optical correlation is recorded by a Pulnix CCD detector array and stored in a computer by
means of an 8-bit frame grabber. The experimental results with the conventional POF and with the
POF optimized to improve the DC are illustrated in Fig. 4. Fig. 4a shows a 3D plot of the
correlation plane when the conventional POF is used. The DC is about 0.12. Fig. 4b represents
a 3D plot of the optical correlation with the optimized POF for $\epsilon = 300$. The simulation and optical
results are given in Table 1 for different values of $\epsilon$.

Table 1: Discrimination capability (DC) of the optimized POF for different values of $\epsilon$.

                      POF     $\epsilon$ = 300   $\epsilon$ = 400   $\epsilon$ = 600   $\epsilon$ = 1000
Simulation            0.12    0.91               0.88               0.81               0.71
Optical experiment    0.14    0.89               0.90               0.87               0.74

Fig. 3. Correlation planes with computer simulation. a) With conventional POF. b) With
POF optimized according to DC.


Fig. 4. Correlation planes (experimental results). a) With conventional POF. b) With
POF optimized according to DC.


4.  Summary. 

A method to improve the DC of the POF is given. It consists of blocking some frequencies. The DC
increases when $\epsilon$ decreases. The experimental results are in accordance with the computer
simulations. The sharpness of the autocorrelation peaks is improved with the optimized POF.

Abstract: The correlation performance of SDFs is improved using a filter modulator
to enhance higher frequencies from individual training images. A second
operator containing all device dependencies allows a hardware-specific optimal filter
to be derived.


1  Introduction 

Although filter synthetic discriminant functions (fSDFs) [1,2] are capable of producing
distortion-tolerant filters, they suffer from poor dynamic range when implemented on spatial
light modulators (SLMs) capable of encoding multi-level values. This is due to most of the
energy being concentrated in a very narrow low-frequency band, leaving the mid-band area
with energy levels below the smallest modulation level on an SLM. In order to solve this
problem, Wang and Chatwin [3,4] investigated the utility of modulating individual training
set images and linearly combining them to produce an SDF which itself was then modulated;
this synthesised filter is called the modified fSDF (MfSDF). The MfSDF presented
here synthesizes the SDF from the linear combination of a set of training images which
are already filter modulated, so that the SDF constructed is dominated by the higher, not
the lower, frequency components of individual training set images. The filter-encoding
constraint is then applied to the SDF. This scheme enables multi-grey-level SLM capabilities
to be fully utilised. The performance of this scheme is compared with that possible
using a binary SLM.


2  MfSDF  Filter  and  its  Design 

The MfSDF extends the fSDF filter design [1,2] to make it more flexible. It begins with a
set of centered training set images $t_n(x,y)$, $n = 0,1,\ldots,k$, spanning the desired distortion
invariant feature range. The modulation operator, $\mathcal{N}$, is first imposed on the individual
training image set to establish the pre-processed training set images $t'_n(x,y)$:

$$t'_n(x,y) = \mathcal{F}^{-1}\mathcal{N}\mathcal{F}[t_n(x,y)] \qquad (1)$$

where $\mathcal{F}$ is the Fourier transform operator. The pre-processed training image set is used
to construct the MfSDF, $s'(x,y)$, for a selected filter modulation, $\mathcal{M}$. The desired peak
correlation response of $s'(x,y)$ is a constant, $c_n$, for each training image $t_n(x,y)$:

$$\iint t_n(x,y)\, s'^{*}(x,y)\, dx\, dy = t_n(x,y) \odot s'(x,y) = c_n, \qquad (2)$$
where the integral is taken over the area of the input field. The function $s'(x,y)$ includes
the filter modulation, $\mathcal{M}$, through the equation:

$$s'(x,y) = \mathcal{F}^{-1}\mathcal{M}\mathcal{F}[s(x,y)] \qquad (3)$$

The purpose of the MfSDF procedure is to determine the function $s(x,y)$ which solves
Eq(2) for a specified modulation function, $\mathcal{M}$. The function $s(x,y)$ is chosen to be a linear
combination of the pre-processed training set images. Thus, a general MfSDF synthesis
equation is derived as:

$$t_m(x,y) \odot \mathcal{F}^{-1}\mathcal{M}\mathcal{F}\Big\{\sum_{n=0}^{k} a_n\, \mathcal{F}^{-1}\mathcal{N}\mathcal{F}[t_n(x,y)]\Big\} = c_m, \quad m = 0,1,\ldots,k \qquad (4)$$

Fig1 gives a simple flow chart of this MfSDF filter design procedure.


Fig1. Flow chart of the MfSDF design procedure


The modulation operators, $\mathcal{M}$ and $\mathcal{N}$, can be specified to take on any desired form.
Whether or not the MfSDF produces equal correlation peaks (ECP) for all the training
set images depends on the choice of the modulation operators $\mathcal{M}$ and $\mathcal{N}$. Thus, Eq(4)
is frequently a system of nonlinear equations which may be solved using an iterative
procedure^^^ based on the Newton-Raphson algorithm.
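
As a point of reference, consider the special case in which both modulation operators are the identity. This is an assumption made here only for illustration, together with real-valued training images and real coefficients $a_n$. The synthesis equation then becomes linear and reduces to the conventional SDF system, where $R_{mn}$ is shorthand introduced here for the cross-correlation of two training images at the origin:

```latex
\begin{aligned}
s(x,y) &= \sum_{n=0}^{k} a_n\, t_n(x,y), \\
R_{mn} &= \iint t_m(x,y)\, t_n(x,y)\, dx\, dy, \\
\sum_{n=0}^{k} R_{mn}\, a_n &= c_m, \qquad m = 0,1,\ldots,k
\quad\Longrightarrow\quad \mathbf{a} = \mathbf{R}^{-1}\mathbf{c}
\quad (\text{provided } \mathbf{R} \text{ is invertible}).
\end{aligned}
```

Once a nonlinear modulation is included, the coefficients no longer follow from a single matrix inversion, which is why an iterative solution is needed.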


3  Simulation  and  Results 

The SDF function $s(x,y)$ was constructed using POF modulation of the individual training
set images. The modulation operator $\mathcal{M}$ was specified to give phase and amplitude
modulation within the limitations of the LCTV, as this may be implemented in real time
on the commercially available Epson LCTV SLMs. Consistent with the Epson SLM performance,
the number, N, of discrete levels of phase and amplitude (MLAP) information
was chosen to be 16. The binary phase only filter (BPOF) modulation is consistent
with using binary SLMs, which are commercially available; hence, the MLAP scheme is
compared with that possible using a binary SLM.
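
The difference between the modulation constraints can be illustrated on a single complex filter value. The sketch below is not the authors' operator: it assumes an idealised device with unit magnitude and uniformly spaced phase levels, and it ignores the coupled amplitude response of a real LCTV. The function names are hypothetical.

```python
import cmath
import math


def phase_only(value: complex) -> complex:
    """Phase-only (POF) modulation: keep the phase, force unit magnitude."""
    if value == 0:
        return 0j
    return value / abs(value)


def binary_phase_only(value: complex) -> complex:
    """Binary phase-only (BPOF) modulation: +1 or -1 from the sign of the real part."""
    return complex(1.0 if value.real >= 0 else -1.0, 0.0)


def quantise_phase(value: complex, levels: int = 16) -> complex:
    """Snap the phase to the nearest of `levels` equally spaced values (unit magnitude).

    Assumption: uniform phase levels; a real multi-level device would also
    quantise amplitude and may couple amplitude to phase.
    """
    step = 2 * math.pi / levels
    index = round(cmath.phase(value) / step) % levels
    return cmath.rect(1.0, index * step)


if __name__ == "__main__":
    sample = -0.2 + 3.0j
    print(phase_only(sample), binary_phase_only(sample), quantise_phase(sample))
```

With two levels the binary constraint keeps only the sign of the real part, whereas 16 levels retain the phase to within one thirty-second of a turn.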

The following filter performances are considered: distortion invariant range, discrimination
sensitivity between in-class and out-of-class images, and ability to accommodate
input image noise. The training set images used in the simulations consist of in-plane
rotated images of the in-class Bradley APC vehicle and out-of-class Abrams M1 tank. Each
image is centered and normalised to unit energy.

Filters designed to be invariant to in-plane rotation, over distortion ranges up to
180°, are constructed using training images of the in-class APC vehicle and out-of-class
M1 tank separated by a rotation increment of 5°. The correlation peaks are specified to
give a constant value of $c$ for the in-class training set images and zero for the out-of-class
training set images at the center position of the output plane. For example, the MfSDF
designed for invariance to in-plane rotation over a distortion range of 45° is constructed
from ten in-class and ten out-of-class training set images. After construction, filters are
correlated with images spanning their entire design range of 45°, at 1° intervals.

The $\mathrm{PSR}_w$ data, which are usually used to evaluate the distortion invariant range of
the filter, for the MLAP/MfSDF and BPOF/MfSDF with distortion ranges up to 130°
are displayed in Fig2. From Fig2, it can be seen that the value of $\mathrm{PSR}_w$ drops below 1
for a distortion range of approximately 65° ~ 70° for the BPOF/MfSDF, whereas for the
MLAP/MfSDF this happens at 120° ~ 125°, almost twice that of the BPOF/MfSDF.


Fig2. The worst correlation peak response to secondary peak response ratio ($\mathrm{PSR}_w$) over
the distortion ranges from 0° to 130°.

Fig3. The worst discrimination capability of filters between in-class and out-of-class
images over the distortion ranges up to 130°.

Fig3 shows the worst-case discrimination values $\mathrm{DC}_w$ of the MLAP/MfSDF and
BPOF/MfSDF measured at any angle within each distortion range up to 130°. It can be seen
from Fig3 that, for the MLAP/MfSDF, the value of $\mathrm{DC}_w$ drops below one for a distortion
range of 120° ~ 125°; hence, the MLAP/MfSDF can be designed for an invariant distortion
range of at least 120° whilst still maintaining 100% discrimination against the out-of-class
targets with training images spaced at 5°. For the BPOF/MfSDF, by contrast, the $\mathrm{DC}_w$
value drops below one at a 60° distortion range, just half the value achieved by the
MLAP/MfSDF method. It should be noted that although the BPOF/MfSDF can achieve
an invariant distortion range of 65° according to the previous paragraph, it
does not guarantee 100% discrimination between the in-class and out-of-class
images; thus its effective invariant distortion range is 60°. This is an issue frequently not
addressed by other authors.
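
The two worst-case figures of merit can be illustrated with a small sketch. The definitions used below are assumptions, since the text does not spell them out: the peak-to-secondary ratio is taken per test image and the minimum over the distortion range is reported, and the discrimination value is taken as the lowest in-class peak divided by the highest out-of-class peak. Function names and the sample numbers are hypothetical.

```python
from typing import Sequence


def worst_psr(in_class_peaks: Sequence[float],
              secondary_peaks: Sequence[float]) -> float:
    """Smallest ratio of correlation peak to secondary peak over all test angles.

    Assumed definition: a value below 1 means that, for at least one angle,
    a secondary peak is higher than the correlation peak.
    """
    return min(p / s for p, s in zip(in_class_peaks, secondary_peaks))


def worst_dc(in_class_peaks: Sequence[float],
             out_of_class_peaks: Sequence[float]) -> float:
    """Lowest in-class peak divided by the highest out-of-class peak.

    Assumed definition: a value above 1 means that a single threshold
    separates every in-class response from every out-of-class response.
    """
    return min(in_class_peaks) / max(out_of_class_peaks)


if __name__ == "__main__":
    # Illustrative numbers only, not data from the paper.
    in_class = [1.0, 0.9, 0.8]
    secondary = [0.5, 0.6, 0.4]
    out_of_class = [0.3, 0.5, 0.4]
    print(worst_psr(in_class, secondary), worst_dc(in_class, out_of_class))
```

Under these definitions a filter is usable over a distortion range only while both values stay above one, which matches the way the two figures are read in the text.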

The filter noise resistance performance is examined by using in-plane rotated images
of the noise corrupted in-class Bradley APC vehicle and out-of-class Abrams M1 tank; the
ratio of image energy to noise energy is equal to 0.5, which means that the images are
severely corrupted by noise.

Fig4a shows the in-class and out-of-class peak correlation responses, using the noise
corrupted in-class and out-of-class images as the input images, for an MLAP/MfSDF
constructed from the noise free in-plane rotated training images with a distortion range from
0° to 40°. The training images used are 5° apart. A similar graph for the BPOF/MfSDF
is shown in Fig4b. From Fig4, it is clear that the MLAP/MfSDF filter is invariant over the
distortion range of 40° whilst delivering a superior discrimination capability (the value of
DC > 1.6 everywhere) between the in-class and out-of-class images in this noise corrupted
case, whereas the BPOF/MfSDF does not give complete distortion invariance over this
range, as several secondary correlation peaks and out-of-class correlation peaks exceed the
lowest in-class correlation peaks.

Fig4. In-class correlation peak (ICCP), secondary peak (ICSCP) and out-of-class
correlation peak (OCCP) responses with a distortion range from 0° to 40°: (a)
MLAP/MfSDF and (b) BPOF/MfSDF.

The $\mathrm{PSR}_w$ data for the MLAP/MfSDF and BPOF/MfSDF, with distortion ranges
up to 60°, are displayed in Fig5. The resulting data are similar to those in
Fig2 and Fig3 except that the input images used are the noise corrupted images. From
Fig5, for the noise corrupted image inputs, the MLAP/MfSDF still delivers distortion
invariance, with 100% discrimination between the in-class and out-of-class
images, up to at least 45°, whereas the BPOF/MfSDF achieves less than 15°.


Fig5. The worst correlation peak response to secondary peak response ratio ($\mathrm{PSR}_w$)
and the worst discrimination capability ($\mathrm{DC}_w$) of filter between in-class and out-of-class
images over the distortion ranges from 0° to 60°: (a) MLAP/MfSDF and (b)
BPOF/MfSDF.

4  Conclusions 


Using 16 levels of modulation, a modulation operator has been formulated to represent
the SEIKO-Epson SLM. Phase only modulation of the individual training images was
selected to allow an initial investigation of the MfSDF design; it is not the optimal choice.
The results show that the MLAP/MfSDF filter offers much improved correlator system
performance with a greater allowable image distortion range whilst maintaining 100%
discrimination between in-class and out-of-class images; furthermore, it shows an improved
ability to accommodate input image noise when compared to the MfSDF with a binary
phase only constraint.
Abstract. A parallel projection method is introduced which overcomes shortcomings
of conventional projection methods and alleviates many of their
limitations. This process can be efficiently employed for image processing,
pattern recognition, optimization and many other signal processing tasks.


1.  Background  and  Development 

With the increasing demands and complexity of signal processing systems, renewed interest
has emerged in set theoretic formulations [1] applied to optimization problems such
as signal synthesis in pattern recognition, constrained deconvolution, image restoration,
etc. The task is usually: "Given some a priori, corrupted, signal, $h^0$, which is known
to have, supposedly, satisfied $N$ constraint sets, try to restore it to a signal which will
satisfy the constraint sets". A similar framework appears in design problems.

Projection methods have been suggested for this task. A projection of an arbitrary
element, $h$, onto a closed convex set $C_i$ is that unique element in the set, $h'$, closest to
$h$, where "close" is measured by some distance function $d_i$. Namely,

$$P_{C_i}(h) = h' \ \text{if and only if} \ \inf_{h_1 \in C_i} d_i(h_1,h) = d_i(h',h); \quad h' \in C_i.$$

Given $N$ sets $C_i$ one usually constructs operators of the form $\sum_{i} w_k(i) P_{C_i}$,
where $w_k(i)$ is some iteration-dependent averaging function. This leads to a sequence
$\{h^0, h^1, h^2, \ldots\}$. One strives to show that the sequence generated converges to a solution
satisfying all constraint sets, i.e., that $\lim_{k\to\infty} h^k \in C_0 \triangleq \bigcap_{i=1}^{N} C_i$. The results known
to date basically amount to the following: if $C_0 \neq \emptyset$, the $d_i$ are equal for all sets
involved and all sets $C_i$ are convex then, under some general conditions, the sequence
$\{h^0, h^1, h^2, \ldots\}$ converges (weakly) to an element in $C_0$.
However, what happens if the constraints are inconsistent, i.e., $C_0 = \emptyset$ (which is
a frequent occurrence due to detection error, slight mis-characterization of the set, overly
optimistic requests, etc.; see [2])? Is there convergence at all and, if so, to what? Also, how
will one project onto a constraint set when the constraint is given indirectly/implicitly,
as some pre-image of another explicit set? This may be transformed in many instances
to an easier projection onto an explicit set, but then the distance function is modified
[3]; what then? Finally, in many image recovery problems more than 2 constraints exist,
not all of which are convex: can a projection based algorithm be employed (classically we
are limited to two sets [4])?

Fortunately, all of the above problems may be solved by use of a special parallel
projection method, based on [5]. It amounts to performing projections of the current
estimate onto all $N$ sets involved and then taking a special average of these projections.


2.  Algorithm 


The special parallel projection algorithm assumes the following form:

Initialization: $h^0(x)$ is an arbitrary function.

Iterative step: given the function $h^k(x)$, calculate,

$$h_i^k(x) := P_{C_i} h^k(x) \quad \text{for all } i = 1, 2, \ldots, N, \qquad (1a)$$

and then, set

$$h^{k+1}(x) := \mathcal{F}^{-1}\left\{ \frac{\sum_{i=1}^{N} \beta_i W_i(u) H_i^k(u)}{\sum_{i=1}^{N} \beta_i W_i(u)} \right\}, \qquad (1b)$$

where $W_i$ are frequency-dependent weighting functions in accordance with (2), see
[2, 6, 7]. $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the Fourier transform and its inverse, respectively, $\beta_i$
are user-determined positive real numbers and upper case letters denote the Fourier
transform of the lower case functions, e.g., $H_i(u) := \mathcal{F}\{h_i(x)\}$.


3. Characteristics of the algorithm

The algorithm has the following desirable characteristics:

1. Even if the sets are nonconvex or $C_0$ is empty, the appropriate summed
(squared) distance error (SDE) decreases along the iterates, for any finite
number of closed constraint sets, where the appropriate SDE, $J$, is defined by

$$J(h) = \sum_{i=1}^{N} \beta_i\, d_i^2(h, P_{C_i}h) \quad \text{and} \quad d_i^2(h_1,h_2) = \int W_i(u)\,|H_1(u) - H_2(u)|^2\, du. \qquad (2)$$

We note that the $d_i$ are just weighted norms, with weights $W_i$.

2. If $C_0$ is nonempty, the algorithm converges to an element of $C_0$. Even if $C_0$ is empty, the
algorithm converges under general conditions [2] (provided all sets $C_i$ are closed
and convex). Moreover, it converges to one of the best solutions possible,
namely one minimizing its (squared) distances from all sets involved, i.e., to
an element globally minimizing $J$.
3. Projections onto the individual sets may be performed with respect to different distance
functions. Thus, many implicit constraints may be converted to explicit constraints
with modified distance functions, which are handled efficiently by this
method [2, 6].
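
The behaviour with inconsistent constraints can be seen in a deliberately small sketch. It is a toy stand-in for the method described above and not its implementation: the unknown is a single real number instead of a function, the constraint sets are closed intervals, the frequency-dependent weights are replaced by one positive scalar per set, and all names are hypothetical.

```python
from typing import Callable, Sequence

Projection = Callable[[float], float]


def project_onto_interval(low: float, high: float) -> Projection:
    """Return the Euclidean projection onto the closed interval [low, high]."""
    def project(h: float) -> float:
        return min(max(h, low), high)
    return project


def summed_distance_error(h: float,
                          projections: Sequence[Projection],
                          weights: Sequence[float]) -> float:
    """Weighted sum of squared distances from h to every constraint set."""
    return sum(w * (h - p(h)) ** 2 for w, p in zip(weights, projections))


def parallel_projection(h0: float,
                        projections: Sequence[Projection],
                        weights: Sequence[float],
                        iterations: int = 50) -> float:
    """Project the estimate onto all sets, then take the weighted average."""
    h = h0
    total = sum(weights)
    for _ in range(iterations):
        h = sum(w * p(h) for w, p in zip(weights, projections)) / total
    return h


if __name__ == "__main__":
    # Two disjoint intervals: no point satisfies both constraints.
    sets = [project_onto_interval(0.0, 1.0), project_onto_interval(3.0, 4.0)]
    estimate = parallel_projection(10.0, sets, [1.0, 1.0])
    print(estimate, summed_distance_error(estimate, sets, [1.0, 1.0]))
```

With equal weights the iterate settles midway between the two intervals, the point that minimizes the summed squared distance; raising the weight of one set pulls the result towards that set.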


4.  Examples 

The present algorithm has already been employed successfully for pattern recognition
purposes [6]. As an illustrative example to demonstrate the power of the algorithm
we apply it to image restoration with inconsistent constraints. We assume a linear
degradation model of the form $g = f * h + n$, where $f$ is the original image, $h$ is the
blurring function, $n$ denotes the observation noise and $g$ is the recorded, blurred and noisy
observation. The task is to restore the original image $f$ from its degraded observation $g$,
where we assume that $h$ is given as well as the statistics of the noise.

In our restoration example we use the following constraints:

1. The estimated image convolved with the blurring function yields a distribution
which is close to the observed distribution, $g$, both in the spatial domain and in
the frequency domain.

2. The estimated image distribution should be close to the Wiener solution, which is
known to be optimal in certain circumstances (in the MSE sense).

3. The estimated image should be real and positive with finite support, say
$[a,b]$, $a<b$.

Any image, $f$, which satisfies all of the above four constraints is considered a solution.
For full details and discussion see [7].
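
Of these constraints, the last one has a particularly simple projection. The sketch below assumes a one-dimensional signal, an unweighted Euclidean distance and an inclusive index range for the support; the function name is hypothetical and the frequency-weighted distances of the algorithm are not modelled.

```python
from typing import List, Sequence


def project_real_positive_support(signal: Sequence[complex],
                                  a: int, b: int) -> List[float]:
    """Nearest real, non-negative signal that vanishes outside indices a..b.

    Inside the support the imaginary part is dropped and negative values are
    clipped to zero; outside the support every sample is set to zero.
    """
    return [max(value.real, 0.0) if a <= index <= b else 0.0
            for index, value in enumerate(signal)]


if __name__ == "__main__":
    estimate = [1 + 2j, -3 + 0j, 0.5 + 0j, 4 + 0j]
    print(project_real_positive_support(estimate, 1, 2))
```

Each sample is treated independently, so the operation costs one pass over the image per iteration.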

We consider an image of a tank (70 × 70 pixels) blurred by a 3 × 3 blurring function
with additive white Gaussian noise. Our initial estimate ($f_0$), for initialization of our
algorithm, is a constant background, space limited to [4, 67] in both dimensions. Shown
in Fig. 1 are the intensities of the original image, the blurred and noisy image, the Wiener
solution ($f_w$) and the restoration after 10 iterations, respectively. The improvement of
our iterative approach over the Wiener result is quite evident. To quantify the results,
the mean square error (MSE) was calculated for reconstruction by the present algorithm
and for the Wiener solution. After 10 iterations with our algorithm the MSE was less than
that of the Wiener solution by a factor of 79. This is, seemingly, surprising, as the Wiener
solution is the best result obtainable by a linear filter when the blurring mechanism is
linear and optimality is judged by the MSE criterion. Nevertheless, our restoration
algorithm yields better results since the Wiener filtering approach does not exploit all
the information (constraints) available.
Figure 1. The restoration results. Clockwise from upper left: The original
image, the degraded blurred and noisy image, the Wiener solution and the
result of our algorithm after 10 iterations.
Abstract. We review the methods and possible applications of recording, reconstruction and
transformation of optical time signals using interference, diffraction and nonlinear
interaction of waves obtained by spatial spectral decomposition. Experimental results on
time-to-space conversion of nanosecond signals using spectral nonlinear optics are
considered.


Consider spatial spectral decomposition of a short optical pulse. The diagram of the optical
spectral instrument used for this purpose is shown in fig. 1. In this diagram G is
the dispersive optical element that carries out the angular spectral decomposition of
light. In our example G is a transmitting diffraction grating. The dispersive element is
placed in the front focal plane F of the lens L. The spectrum plane S of the instrument
coincides with the back focal plane of the lens L.


Fig.  1. 

Let a short wave packet WP be incident on the diffraction grating G. The radiation
of the wave packet scatters from the rulings of the grating. We may consider that the moving
wave packet gives rise to a small source of monochromatic radiation that moves over
the plane of the grating. The radiation of a small monochromatic source moving along
the front focal plane is converted by the lens L into a parallel beam. This beam rotates
around the back focal point O of the lens. Irrespective of the beam rotation, the velocity
of wave propagation is the same at any point of the beam cross section. This is possible
only if the beam is bent as shown in the figure. The wavelength of the bent beam depends
on the transverse coordinate, and this is precisely the spectral decomposition of the
incident light pulse. Such a rotating beam is a useful visual representation of the spectral
decomposition wave (SDW) of a short light pulse.

The spatial amplitude-and-phase distribution in the SDW corresponds to the
amplitude-and-phase distribution in the Fourier image of the pulse. So the SDW contains the
complete set of the spatially separated Fourier components of the initial temporal signal.
The length of the SDW is equal to the inverse of the frequency resolution of the
spectral device used and should be much greater than the length of the initial wave packet.

Let us direct two short pulses onto the entrance of the spectral device: a reference
pulse (having a simple shape) and a signal pulse. Even if these pulses are separated in
time, their SDWs may be superimposed, and therefore the two SDWs may interfere. The
result of recording the interference pattern can be called a spectral hologram. It can be
shown that the spectral hologram contains the total record of the signal pulse [1].

To reconstruct a signal pulse we use a double spectral device. The left-hand part
of this device is the same as the instrument depicted in fig. 1, and its right-hand part
may be obtained by mirror reflection of the left-hand part in the spectral plane S.
The left-hand part forms the SDW of the incident time signal on the spectrum plane,
while the right-hand part executes the inverse operation of recombining the frequency
components. As a result we again obtain a temporal signal at the exit of the double spectral
device. We put the developed spectral hologram in the spectrum plane, and we direct the
reference pulse onto the entrance of the device. The SDW of the reference pulse strikes
the hologram and diffracts from the hologram inhomogeneities. As a result we obtain a
new wave. It may be shown that this wave contains a replica of the SDW of the signal
pulse. After spectral recombination of this radiation by the right-hand part of the device, we
obtain a replica of the signal pulse stored on the hologram. In addition, we may obtain its
time-reversed replica.

The reconstruction of optical temporal signals by diffraction of spectral decomposition
waves was proposed in [2]. These results were further developed and reviewed in [3,4].
The first experiments were reported in [5,6]. This area may be called spectral
holography.

There is a close analogy between the transformations of time-varying signals by
filtering their SDWs and the methods of normal Fourier optics. Using this analogy it is
possible to develop Fourier optics for temporal signals from the well-known principles of
Fourier optics for steady state radiation. This applies first of all to Fourier transform
holography, which is the most powerful method of Fourier optics [7].

The natural consequence of the analogy of spectral holography with normal
holography is the methods of dynamic spectral holography and spectral nonlinear optics [8].
SDWs of several light pulses may be superimposed in a common optical nonlinear
medium. This makes possible the dynamic interaction of SDWs. As the result of the interaction
a new wave may be generated. It can be shown that this newly generated wave usually has the
properties of the SDW of a certain wave packet. This new wave packet can be obtained in
real form with the use of a spectral device operating on backward waves. The
spectrum of the newly generated pulse may be considered as the product of the spectra that
participate in the interaction. Therefore the newly generated temporal signal is actually the
result of fast convolution and/or correlation of the initial temporal signals.
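
The link between a product of spectra and convolution or correlation is the convolution theorem. The following sketch checks its discrete, circular form numerically with a plain discrete Fourier transform; it illustrates the mathematics only and does not model the optical interaction. The function names are hypothetical.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[complex], inverse: bool = False) -> List[complex]:
    """Direct O(n^2) discrete Fourier transform (inverse includes the 1/n factor)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def circular_convolution(a: Sequence[complex], b: Sequence[complex]) -> List[complex]:
    """Direct circular convolution: c[t] = sum_m a[m] * b[(t - m) mod n]."""
    n = len(a)
    return [sum(a[m] * b[(t - m) % n] for m in range(n)) for t in range(n)]


def circular_correlation(a: Sequence[complex], b: Sequence[complex]) -> List[complex]:
    """Direct circular correlation: c[t] = sum_m a[m] * conj(b[(m - t) mod n])."""
    n = len(a)
    return [sum(a[m] * complex(b[(m - t) % n]).conjugate() for m in range(n))
            for t in range(n)]


def via_spectra(a: Sequence[complex], b: Sequence[complex],
                correlate: bool = False) -> List[complex]:
    """Multiply the spectra (conjugating the second for correlation) and invert."""
    spec_a, spec_b = dft(a), dft(b)
    if correlate:
        spec_b = [v.conjugate() for v in spec_b]
    return dft([p * q for p, q in zip(spec_a, spec_b)], inverse=True)


if __name__ == "__main__":
    first = [1, 2, 0, 0]
    second = [0, 1, 0.5, 0]
    print(circular_convolution(first, second))
    print([round(v.real, 6) for v in via_spectra(first, second)])
```

Multiplying by the conjugate spectrum instead of the spectrum itself turns the convolution into a correlation, which is why both operations are available from the same interaction.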

In addition, interactions of SDWs with monochromatic inhomogeneous waves
are possible. We may consider the monochromatic wave as produced by the optical Fourier
transform of some spatial signal (a coherent 1-D image). In this case we obtain convolution
or correlation in which both temporal and spatial signals participate. The result of
such a transform may be a spatial as well as a temporal signal. In other words, we may mix
spatial and temporal signals in real time and obtain a new spatial or temporal signal. This
makes it possible to carry out dynamic space-to-time or time-to-space transformations of signals
[8,4,5].

The upper bound for the duration of optical pulses that can be processed by the
methods of spectral holography and spectral nonlinear optics is imposed by the frequency
resolution of the spectral device. When a biconfocal Fabry-Perot interferometer [9]
is used, this bound may be as great as hundreds of nanoseconds.

Here we examine in more detail dynamic time-to-space transformation using
sum-frequency generation of two SDWs.

Consider the optical device shown in fig. 2. The left-hand part of this device is
an optical spectral instrument such as that depicted in fig. 1. The SDWs of the incident pulses
are formed near the spectrum plane S. This plane is also the entrance plane of the
optical Fourier processor, which is the right-hand part of the fig. 2 device. The lens L of
the right-hand part of the device transforms the spatial distribution of a monochromatic field
on the S plane into its Fourier image on the F plane.


Let two wave packets (optical pulses) WP enter the device of fig. 2. One of the pulses
is the reference pulse and the other is the signal pulse. The SDWs of the two pulses are
superimposed in the spectral plane S. The spectral device is so arranged that the
directions of the spectral dispersions for the two pulses are opposite. This means
that the "blue" end of one spectrum on the S plane corresponds to the "red" end of the
other spectrum. This also means that the SDWs of the two pulses rotate in opposite
directions. In our example this is realized by projecting the two pulses onto the diffraction
grating G from different directions. Due to this, two light sources move over the grating
with the same speed but in opposite directions, giving rise to two parallel
beams near the S plane that rotate with the same speed but in opposite directions.

We place a plate of a quadratic nonlinear medium N in the vicinity of the spectral
plane S and examine the interaction of the two SDWs in this medium. We may divide
the nonlinear medium into elementary volumes, determined in the transverse direction
by the spatial resolution of the optical system and in the longitudinal direction by the depth
of field. The radiation interaction in any of these small volumes may be thought of as the
interaction of plane monochromatic waves. Due to the interaction of the two SDWs in the
quadratic medium we obtain sum-frequency generation in every elementary volume.
Since the changes of frequency for the two SDWs along the transverse co-ordinate have
opposite directions, we obtain new radiation whose frequency does not depend on the
transverse co-ordinate, i.e., we obtain a monochromatic wave. The spatial distribution of
the amplitude and phase of the newly generated monochromatic radiation over the S
plane is essentially the same as the corresponding distribution in the SDW of the signal
wave packet. So the newly generated monochromatic wave is the analog of the Fourier
image of the signal pulse.
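
The cancellation of the frequency dependence can be written out explicitly. As simplifying assumptions made here, let both SDWs share the centre frequency $\omega_0$ and let the dispersion be linear in the transverse co-ordinate $x$, with a hypothetical coefficient $\kappa$:

```latex
\omega_{\mathrm{ref}}(x) = \omega_0 + \kappa x, \qquad
\omega_{\mathrm{sig}}(x) = \omega_0 - \kappa x, \qquad
\omega_{\mathrm{sum}}(x) = \omega_{\mathrm{ref}}(x) + \omega_{\mathrm{sig}}(x) = 2\omega_0 .
```

The sum frequency is the same at every point of the S plane, which is the sense in which the generated wave is monochromatic.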

It is not difficult to see that the amplitude-and-phase distribution of the monochromatic
radiation in the F plane repeats the temporal amplitude-and-phase structure of
the signal pulse. So we obtain a dynamic transformation of the temporal signal into a
spatial signal. The time required for this transformation is limited only by the velocity
of light and the length of the optical device.

Consider some results of our experiments. The source of ultrashort light pulses was
a mode-locked Nd-YAG laser emitting pulses with a duration
of about 240 psec at a wavelength of 1.064 μm. These pulses served as the reference
pulses. Signal pulses of complicated shape were formed from the initial laser pulses
using multiple reflections with time delays. To obtain spectral resolution sufficient for
operating with nanosecond pulses we used a Fabry-Perot etalon with side entrance
as the dispersive element [4].

An example of dynamic time-to-space transformation is presented in fig. 3.
Here J(t) is the shape of the initial temporal signal and J(x) is the spatial distribution of
the monochromatic radiation obtained after the described transformations.


Fig.  3. 
Abstract. The concept of "spatial amplification" based on transverse effects
in optical bistability is proposed. It is shown experimentally that it is possible to
detect, with the help of a "spatial amplifier", weak optical signals of a power
that  is  lO'^-lO®  times  less  than  the  power  of  noise. 


1.  Introduction 

In a number of algorithms for optical information processing one needs to detect the
presence or absence of information in at least one pixel of a wide-aperture matrix. Since
the "0"-level signal power in optical systems is not equal to zero, the power of the
background signal coming from the whole matrix is many times higher than the power
of the information signal in any individual pixel. Therefore, recording of a weak signal
is a serious problem and an experimental challenge.
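
The scale of the problem can be seen from a rough power budget. The symbols below are introduced here for illustration only: $M$ is the number of pixels in the matrix, and $p_0$ and $p_1$ are the powers of a single pixel in the "0" and "1" states. If only one pixel carries information, the ratio of the useful signal to the background collected from the whole aperture is approximately

```latex
\frac{P_{\mathrm{signal}}}{P_{\mathrm{background}}}
  \approx \frac{p_1 - p_0}{M\, p_0}
  = \frac{1}{M}\left(\frac{p_1}{p_0} - 1\right),
```

so, for a fixed contrast $p_1/p_0$, the ratio falls inversely with the number of pixels.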

In the present contribution we propose a concept for increasing the signal-to-noise
ratio in 2D optical information processing systems, which is based on the application of
transverse effects in optical bistability (OB). This concept allows one to develop a technique
for identifying the states of separate information pixels in wide-aperture 2D systems
for information processing.


2.  Principle  of  operation 

The basic idea of the concept proposed is to provide favourable conditions for the emergence
and propagation of switching waves in optical bistable layers [1,2] if the switch-on
threshold has been exceeded at least in one local area. This leads to a many-fold
increase of output light power up to the highest value, which corresponds to total switching
of the whole bistable plate.

Fig. 1. Experimental setup. OB TFI - optically bistable thin-film interferometer;
L - lenses; BS - beam splitter; PD - photodiode; CCD - CCD or TV camera;
/^(r) and - intensity profiles of signal, holding and output beams respectively;
dashed line shows output profile when

E-mail: dopit%bas02.basnet.minsk.by@deinos.su

Using an OB-matrix formed on an optically uniform nonlinear layer one can switch
on not only a specified pixel but its nearest surroundings as well, and thus the resulting
output signal which is registered and analyzed can be increased many-fold. The integral
output beam is further directed either to photodetectors, when an optoelectronic
information processing system is employed, or to an optical threshold device, when an
all-optical computing system is considered.

3.  Experimental 

The technique proposed has been tested by modeling a "spatial brightness amplifier"
experimentally using a thin-film bistable ZnS-interferometer (TFI) [3]. The experimental
layout is shown schematically in Fig. 1. The samples used were prepared under special
conditions which provided high uniformity of optical parameters over the whole optical
aperture of about 36 mm.

Inhomogeneity  of  the  nonlinear  optical  characteristics  of  the  TFI  was  examined  in 
the  bistable  mode  of  operation.  Switching  thresholds  of  bistable  pixels  were  measured 
at  seven  equidistant  points  along  the  sample’s  diameter  in  at  least  two  perpendicular 
directions. A set of bistable loops for different points, with the corresponding switch-on and
switch-off thresholds registered by oscilloscope, is presented in Fig. 2. One can see that
the relative spread of measured threshold powers does not exceed ~10%.

The spatial resolution of our 2D bistable device is defined as the highest packing
density of simultaneously and independently operating channels, and it has been measured
following the technique described in [4]. As a natural resolution criterion one can
consider the minimal distance between nearest bistable pixels at which the information
Fig. 2. Spread in threshold powers of the bistable TFI along its diameter; oscilloscope drawing.

Fig. 3. Intensity profiles of beams (intensity in arb. units): (1) - signal, (2) - output when $I_s(r)$ = "0", (3) - output when $I_s(r)$ = "1".

Fig. 4. Kinetics of integral output power of the beams (power in arb. units): (1) - signal, (2) - output when $I_s(r)$ = "0", (3) and (4) - output when $I_s(r)$ = "1".


state of any pixel is not influenced by the states of its 4 neighbours, provided all of
them are controlled by holding intensities within the bistable region. For individual pixels
with a size of about 10 μm the measured spatial resolution is 350...400 lines/cm, which
corresponds to a total of $(1.2\ldots1.6)\cdot10^{6}$ pixels over the whole aperture of the TFI.
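
The quoted pixel count can be checked from the measured resolution and the aperture size. The sketch below assumes a circular aperture 36 mm in diameter filled with a square grid of pixels at the stated line density; the function name and this simple geometric model are illustrative assumptions and are not part of the measurement technique of [4].

```python
import math


def pixel_count(lines_per_cm, aperture_diameter_mm):
    """Number of pixels on a circular aperture, assuming a square grid
    with the given line density in both directions (an assumption)."""
    radius_cm = aperture_diameter_mm / 10.0 / 2.0
    area_cm2 = math.pi * radius_cm ** 2
    return area_cm2 * lines_per_cm ** 2


low = pixel_count(350, 36)   # lower bound of the measured resolution
high = pixel_count(400, 36)  # upper bound of the measured resolution
print(f"{low:.2e} ... {high:.2e} pixels")
```

Under these assumptions the result falls in the range of roughly 1.2 to 1.6 million pixels given in the text.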

While modelling the "brightness amplifier" experimentally, we formed in the plane of
the bistable interferometer a distribution of incident light intensity that consisted
of a relatively wide illuminated area with approximately uniform intensity
and a sharply focused Gaussian beam with a diameter of about 10 μm. The size of the wide
area was limited by the maximum power of the holding laser beam. In our experiment the
output beam of an Ar laser with a power of 500...600 mW could illuminate an area of
diameter ~180 μm. The second beam modelled an individual pixel, the information
state  of  which  is  to  be  detected.  In  order  to  form  the  above  distribution  of  input  light 
the  laser  beam  was  split  into  two  beams.  One  of  them  (with  lower  intensity)  was  directly 
focused  by  the  lens  onto  the  surface  of  the  TFI.  The  other  (with  higher  intensity)  was  first 
widened  by  an  additional  lens.  The  power  of  each  beam  was  controlled  independently. 


4.  Results  and  discussion 

The  main  experimental  results  are  presented  in  Figs.  3-4. 

Fig.  3  shows  the  intensity  profile  of  the  signal  beam  (1),  the  output  distribution  of 
background  intensity  on  the  whole  illuminated  area  (2),  and  the  output  intensity  profile 
(3) which indicates the presence of a signal somewhere within the switched area. In Fig.
4 the time dependences are plotted for the power of the signal beam to be detected (1) and
for the power of the output light beam when there is no signal (2), and while detecting a
local signal with a maximum intensity that is 10% (3) and 20% (4) of the background
intensity.
For the OB device operating as a "spatial brightness amplifier" the gain factor K
can be defined as the ratio of the increase in integrated output power in the channel of the
holding beam (or a special probe beam), resulting from the emergence of the ON-state in one
of the pixels, to the integrated power of the latter:

$$K = \frac{S I_h T_\uparrow - S I_h T_\downarrow}{\delta S\, I_s T_\downarrow} = \frac{S I_h}{\delta S\, I_s}\left(\frac{T_\uparrow}{T_\downarrow} - 1\right),$$

where $\delta S$ and $I_s$ are the cross-sectional area and intensity of the signal light beam
which switches the pixel being detected, $I_h$ is the intensity of the holding beam (with a uniform
distribution), $S$ is the switched-on area, and $T_\uparrow$ and $T_\downarrow$ are the transmission of the bistable
device in the switched-on (upper) and switched-off (lower) states. In our experiment $\delta S \approx 75\ \mu\mathrm{m}^2$,
$S \approx 2.7\cdot10^{4}\ \mu\mathrm{m}^2$, $I_s/I_h \approx 0.1$, $T_\uparrow \approx 60\%$, $T_\downarrow \approx 25\%$. Thus the experimentally
obtained gain factor of our "spatial brightness amplifier" was about $5\cdot10^{3}$ and was
limited by the output power of the laser available.

The above expression also allows one to estimate the gain factor that can be reached
when the holding beam covers the whole operating area. The diameter of the
illuminated spot is then 36 mm, whereas the diameter of the signal beam and the intensity
ratio are assumed to be 10 μm and $I_s/I_h \approx 0.1$, as in the experiment. The estimated
limit of the gain factor reaches the value $K \approx 1.8\cdot10^{8}$.
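
Both gain figures can be reproduced numerically. The sketch below reads the gain expression as $K = (S/\delta S)(I_h/I_s)(T_\uparrow/T_\downarrow - 1)$, which is the form that matches the quoted numbers; the function names are illustrative, and the full-aperture case assumes circular holding and signal spots.

```python
import math


def gain_factor(switched_area, signal_area, signal_to_holding, t_on, t_off):
    """Gain K = (S / dS) * (I_h / I_s) * (T_on / T_off - 1).

    Areas must share the same units; signal_to_holding is I_s / I_h.
    """
    return (switched_area / signal_area) / signal_to_holding * (t_on / t_off - 1.0)


def disc_area(diameter):
    """Area of a circular spot of the given diameter."""
    return math.pi * (diameter / 2.0) ** 2


# Experimental case: S ~ 2.7e4 um^2, dS ~ 75 um^2, I_s/I_h ~ 0.1
k_exp = gain_factor(2.7e4, 75.0, 0.1, 0.60, 0.25)

# Estimated limit: 36 mm holding spot, 10 um signal spot (diameters in um)
k_lim = gain_factor(disc_area(36e3), disc_area(10.0), 0.1, 0.60, 0.25)

print(f"K (experiment) ~ {k_exp:.1e}")
print(f"K (full aperture) ~ {k_lim:.1e}")
```

The first value comes out near $5\cdot10^{3}$ and the second near $1.8\cdot10^{8}$, consistent with the figures in the text.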

The gain factor of a "spatial amplifier" can also be increased by
increasing the precision of intensity control (a lower value of the ratio $I_s/I_h$ can then be
reached). But this requires much better stabilization of the holding beam, both in its
power and spatial uniformity, than that reached in our experiment.

The concept of "spatial brightness amplification" has also been used for analysis of
the output light fields of the coherent correlation processor.
Abstract. A new form of heterodyning in an all-optical bistable system, related
to stochastic resonance, is reported. It is shown theoretically and experimentally
that the heterodyne signal and signal-to-noise ratio can be enhanced by the addition
of noise to the input signal.


1.  Introduction 

Prospective  applications  of  optical  bistability  (OB)  in  digital  optical  computing  and 
signal  processing  are  related  to  the  development  of  miniaturised  devices  with  reduced 
threshold  power  [1].  This  implies  that  noise  is  one  of  the  important  limiting  factors  (see, 
for  example,  [1])  and  highlights  the  problem  of  controlling  the  signal-to-noise  ratio  (SNR) 
in OB devices. It is thus of particular interest to investigate in OB systems phenomena
related to stochastic resonance (SR) [2], in which the signal amplitude and SNR can be
increased rather than suppressed by adding noise to the system.

Most studies of SR to date (including those for OB systems [3-5]) have related to
signals of low frequency [6], i.e. less than the reciprocal relaxation time of the system.
It was shown recently [7], however, that in the practically important case where two
high-frequency signals are mixed nonlinearly, SR can be used to strongly increase the
amplitude of the resulting heterodyne signal and also to enhance its SNR.

In this paper we extend the idea to obtain a new form of optical heterodyning that is
noise-protected through stochastic resonance. In Sec. 2 a theory of noise-enhanced optical
heterodyning for the case of noise on the intensity of the input signal is outlined (the case of
noise on the reference beam will be considered elsewhere). In Sec. 3 experimental results
demonstrating an increase of the heterodyne signal and SNR with increasing noise
intensity are presented. Sec. 4 contains concluding remarks.
2. Theory

The system considered is a bistable interferometer with intensity-dependent phase gain
$\phi$ driven, in addition to the resonant signal beam, by a nonresonant reference beam. The
intensity $I_T$ of the transmitted resonant radiation is related to the input intensity $I_{in}$ via
a periodic transmission coefficient $N(\phi)$. The phase gain is assumed to depend linearly
both on the intensity of the resonant intracavity radiation and on the intensity $I_{ref}$ of
the reference beam.

Assuming that the intensity $I_{in}$ of the resonant radiation has a modulated high-frequency
component and zero-mean Gaussian noise $\Delta I(t)$, whereas $I_{ref}$ has an unmodulated
high-frequency component, the dynamics of the system can be described by a
Debye relaxation equation for the phase gain $\phi$


$$\dot{\phi} + \frac{\Delta\phi}{\tau} = \bigl(I_{in} + A_{in}(t)\cos(\omega_0 t + \psi(t)) + \Delta I(t)\bigr)M(\phi) + I_{ref} + A_{ref}\cos\omega_0 t. \qquad (1)$$

Here $\Delta\phi = \phi - \phi_0$, where $\phi_0$ is the phase in the dark, $I_{ref}$ and $I_{in}$ are constant components
in the intensities of the two laser beams, $A_{ref}$ is the amplitude of the reference signal,
and $A_{in}(t)$ and $\psi(t)$ are the modulated amplitude and phase of the input signal. The
periodic functions $M(\phi)$ and $N(\phi)$ characterize the nonlinear response of the system and
have well-known forms for simple models (cf. [1]).

If the characteristic frequencies of modulation $\Omega \sim \dot{\psi}$ and $\dot{A}_{in}/A_{in}$ are low compared
with $\omega_0$, and $\tau^{-1} \ll \omega_0$, the motion of the system consists of fast oscillations at frequency
$\omega_0$ superimposed on a slower (averaged over $\delta t = 2\pi/\omega_0$) motion which can be described
as overdamped Brownian motion, with the coordinate $x$ (cf. [7,8]), in a bistable
potential $U(x)$:

$$\dot{x} = -U'(x) + \frac{dF}{dx}\,A\sin\psi(t) + \Delta I(t), \qquad A = A_{ref}A_{in}(t)/2\omega_0, \qquad (2)$$

$$x = \int d\phi\, M^{-1}(\phi); \qquad U(x) = -I_{in}x + \int d\phi\, M^{-2}(\phi)\left(\frac{\Delta\phi}{\tau} - I_{ref}\right).$$

Here $F(x) = M^{-1}(\phi)$. For small modulation amplitude, $A_{in}/\omega_0 \ll 1$, heterodyning
can be characterized by the amplitude of the low-frequency signal for
sinusoidal modulation, $A_{in} = \mathrm{const}$, $\psi(t) = \Omega t$, and standard linear response theory can
be applied to the analysis of Eqs. (1)-(2). In doing so one has to find the susceptibility
$\chi(\Omega)$ for the averaged intensity of transmitted radiation

$$\langle I_T(t)\rangle = \sum_{n=1,2} w_n I_T^{(n)} + \mathrm{Im}\bigl[\chi(\Omega)\,A\exp(-i\Omega t)\bigr],$$

where $w_n$ are the stationary populations of the states. For low noise intensity $D$ the
susceptibility is given by the sum of partial contributions from vibrations about the
stable states ($I_T^{(n)} = I_{in}N(\phi_n)$), weighted by $w_n$, and from the interwell transitions


Here, $W_{nm}$ is the probability of the transition $n \to m$ between the stable states for
$A_{in} = A_{ref} = 0$. The dependence of $W_{21}$ on the noise intensity $D$ is of the activation
type, and therefore the amplitude of the heterodyne signal ($\propto |\chi(\Omega)|$) may increase with
increasing noise intensity. The SNR for the heterodyning can be characterized by the
ratio $R$ of the low-frequency signal in the intensity of the transmitted radiation
to the value of the power spectrum at the same frequency for
$A = 0$, which can be evaluated in a similar way to $\chi(\Omega)$. For $\omega \ll U''(\phi_n)$ it has the form

$$\sum_{n=1,2}\Bigl[\;\cdots\;\Bigr] \;+\; \cdots\,\frac{W_{12}+W_{21}}{(W_{12}+W_{21})^{2}+\omega^{2}}. \qquad (4)$$

It follows immediately from (3), (4) that, for $\Omega$ …, not only the heterodyne
amplitude but also $R$ can indeed increase with increasing noise intensity in the range of
$I_{in}$, $I_{ref}$ for which $w_1 \sim w_2$.


3.  Experiment 

These predictions were first tested quantitatively by electronic analogue simulation [7]
for the case of overdamped motion in the symmetric bistable Duffing potential, for
$F(x) = x$. Noise-enhanced heterodyning was thereby demonstrated for both low- and
high-frequency noise, in good agreement with theory.
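
The noise-induced growth of the low-frequency response can be illustrated numerically. The sketch below is not the analogue circuit of [7]. It assumes that, for $F(x) = x$, the slow dynamics reduce to overdamped motion in the quartic Duffing potential $U(x) = -x^2/2 + x^4/4$, driven weakly at the difference frequency $\Omega$ and subject to white noise of intensity $D$. The Euler-Maruyama integrator, the function name and all parameter values are illustrative assumptions.

```python
import math
import random


def response_amplitude(noise_d, drive_a=0.1, period=100.0, n_periods=40,
                       dt=0.02, seed=1):
    """Amplitude of the component of x(t) at the drive frequency.

    Integrates dx/dt = x - x**3 + drive_a*sin(omega*t) + noise with the
    Euler-Maruyama scheme; the noise has intensity noise_d, i.e.
    <xi(t) xi(t')> = 2*noise_d*delta(t - t').
    """
    rng = random.Random(seed)
    omega = 2.0 * math.pi / period
    n_steps = int(round(period / dt)) * n_periods
    noise_step = math.sqrt(2.0 * noise_d * dt)
    x = 1.0  # start in the right-hand well
    sin_sum = 0.0
    cos_sum = 0.0
    for k in range(n_steps):
        phase = omega * k * dt
        s = math.sin(phase)
        sin_sum += x * s
        cos_sum += x * math.cos(phase)
        x += (x - x ** 3 + drive_a * s) * dt + noise_step * rng.gauss(0.0, 1.0)
    # Fourier projection over an integer number of drive periods
    return 2.0 * math.hypot(sin_sum, cos_sum) / n_steps


quiet = response_amplitude(noise_d=0.005)  # almost no interwell hopping
noisy = response_amplitude(noise_d=0.1)    # hopping rate comparable to omega
print(f"response at low noise:      {quiet:.3f}")
print(f"response at moderate noise: {noisy:.3f}")
```

At low noise the system only vibrates about one stable state, so the response is small; with moderate noise the interwell transitions synchronise partly with the drive and the response grows, which is the mechanism behind the activation-type dependence discussed in Sec. 2.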

We now report an optical experiment based on a double-cavity membrane system
(DCMS) [8], consisting of a thin (~1 μm) GaSe semiconductor film
separated from a dielectric mirror by a metal diaphragm ~500 μm in diameter.

The incident beam from an argon laser, of wavelength 514.5 nm, propagating along
the normal to the mirror, provides an input signal. An additional beam of
wavelength 488 nm from the argon laser, inclined with respect to the DCMS axis,
provides a reference signal. The intensities of the laser beams are modulated in
time by two electro-optic modulators, to which periodic signals and noise are applied.
Optical bistability arises because of thermoelastic bending of the membrane
caused by the main (514.5 nm) laser beam. The phase gain of the air resonator
is linear in the bending and thus follows adiabatically the thermal relaxation
of the film. Heating of the DCMS by the 488 nm reference signal is directly
proportional to its intensity.

Fig. 1. The signal-to-noise ratio R as a function of noise intensity (arbitrary units).
The inset shows the magnitude of the heterodyne signal for $\omega_0$ = 2.1 kHz, $\Omega$ = 3.92 Hz
as a function of noise intensity.

The measured relaxation time $\tau_r$ of the DCMS was ~2 ms. Thus, to meet the condition
$\Omega, \tau_r^{-1} \ll \omega_0$, the reference signal (488 nm) was modulated periodically at a frequency
$\omega_0$ = 2.1 kHz, while the input signal (514.5 nm) was modulated at $\omega_0 + \Omega$, with $\Omega$ = 3.92
Hz, and in addition by noise with a cutoff frequency of 5 kHz. A heterodyne signal at
frequency $\Omega$ = 3.92 Hz was detected in the transmitted light intensity $I_T$ at wavelength
514.5 nm. With noise-induced fluctuational transitions occurring between two stable
states of the DCMS, a strong heterodyne signal appeared, superimposed on the zero-frequency
Lorentzian peak in the spectral density of fluctuations of $I_T$ [8].
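
The separation of time scales used in the experiment can be checked with a few lines of arithmetic. The sketch below assumes that the quoted modulation frequencies (2.1 kHz and 3.92 Hz) are cyclic frequencies in Hz and converts them to angular frequencies for comparison with the relaxation rate; the variable names are illustrative.

```python
import math

tau_r = 2e-3          # measured relaxation time of the DCMS, s
f_carrier = 2.1e3     # modulation frequency of the reference beam, Hz (assumed cyclic)
f_heterodyne = 3.92   # difference frequency, Hz (assumed cyclic)

relaxation_rate = 1.0 / tau_r                  # s^-1
omega_0 = 2.0 * math.pi * f_carrier            # rad/s
omega_het = 2.0 * math.pi * f_heterodyne       # rad/s

print(f"omega_0 / relaxation rate   = {omega_0 / relaxation_rate:.1f}")
print(f"omega_0 / omega_heterodyne  = {omega_0 / omega_het:.1f}")
print(f"relaxation rate / omega_het = {relaxation_rate / omega_het:.1f}")
```

With these numbers the carrier is more than an order of magnitude faster than the relaxation rate, which in turn is more than an order of magnitude faster than the heterodyne frequency, so the system cannot follow the carrier but does follow the slow difference-frequency signal.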

Strong enhancements of both the heterodyne signal (by a factor of ~1000) and of
the signal-to-noise ratio R were observed with increasing noise intensity, as shown in
Fig. 1. The dependence of R on the noise intensity is of the characteristic reversed-N
shape familiar from earlier studies of SR in bistable systems [6] and consistent with the
theory outlined above. The enhancement of the SNR occurs within a restricted range
of noise intensities, as expected, and the ratio of the value of R at the local maximum
to that at the minimum (i.e., the maximum noise-induced "amplification" of the
signal-to-noise ratio) is ~10.


4.  Conclusion 

In  conclusion,  we  have  predicted,  and  observed  experimentally  in  an  all-optical  system, 
the  new  and  potentially  useful  phenomenon  of  noise-enhanced  optical  heterodyning. 
Similar  effects  can  be  mediated  by  the  thermal  noise  in  the  device.  A  fuller  discussion 
and  a  detailed  quantitative  comparison  with  the  theory  will  be  given  elsewhere. 
Abstract. The paper addresses the design of an opto-electronic implementation
of a stochastic bit-stream neural system which operates by manipulating
digital bit streams to create emergent activation functions using extremely
simple logic.


1.  Introduction 

This paper investigates possible strategies for implementing a bit-stream stochastic
neural network (BSN) design [6] making use of the latest opto-electronic technologies
[3, 4]. The network design operates by manipulating digital bit streams to create
an emergent activation function using extremely simple logic. The neural design and
implementation strategy will potentially yield a number of impressive benefits.

• The optical connections have a very high bandwidth which can overcome the I/O
bottleneck in VLSI implementations of the stochastic neural design [5].

• The stochastic approach introduces real values through a precisely controlled probabilistic
technique, which makes possible a complete and exact mathematical description
and simulation of the network functionality.

• In contrast to analogue implementations, digital networks can be combined without
introducing further uncertainties in the accuracy of the computation. Hence
implementations can be scaled up without major modifications.

• Both the speed and the digital nature of the basic operations mean that effective
on-chip learning may be incorporated into the design, by exploiting the
stochastic properties of the network operation.

2.  Bit  stream  neural  design 

A  standard  neural  design  involves  a  network  of  neurons  each  calculating  the  weighted 
sum  of  its  inputs  and  then  applying  a  sigmoid-like  function  to  the  result.  Hence,  one  of 
the  fundamental  problems  inherent  in  a  massively  parallel  implementation  of  a  neural 
architecture  is  how  to  multiply  together  the  real  inputs  and  their  corresponding  weights 
without resorting to cumbersome bit-parallel digital circuitry. The analogue solution to
this problem incurs a number of difficulties, including relatively low resolution, cross-chip
variations in component performance, and the general problems in constructing
large-scale reliable systems.
The stochastic bit-stream approach combines the benefits of digital circuitry and
analogue simplicity. In this approach real values are represented by stochastic bit streams.
Such a bit stream is a sequence of 0's and 1's with a fixed probability p of a 1 occurring.
It is used to represent the real value p in the interval [0, 1] for unsigned values,
or, in an alternative representation, the value (2p - 1) in the interval [-1, 1] for signed
values  [2].  Each  stochastic  input  bit  to  a  neuron  is  weighted  by  either  ANDing,  when 
operating  on  unsigned  values,  or  XORing,  when  operating  on  signed  values,  with  a 
corresponding  weight  bit.  One  bit  of  each  of  the  individual  weighted  input  streams  is 
summed  and  compared  with  a  threshold  value  to  determine  one  bit  of  the  output  stream. 
The  interaction  of  the  probability  distributions  of  the  sum  and  the  thresholds  creates  a 
sigmoid-like  functionality,  which  can  be  precisely  described  [5].  The  design  of  a  single 
neuron  is  therefore  extremely  simple,  making  it  possible  to  map  a  large  network  onto  an 
established  implementation  technology. 
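
The bit-stream arithmetic described above can be sketched in a few lines. The example below is an illustration only, not the design of [5] or [6]: the function names, stream length, fixed integer threshold and all-ones weights in the sweep are assumptions. It shows that ANDing two independent unsigned streams multiplies their probabilities, and that counting weighted bits against a threshold gives an output probability that rises in a sigmoid-like way with the input probability. For signed values, note that with 1 mapped to +1 (value 2p - 1) an XOR gate yields the negated product and an XNOR gate the product itself, so the sign convention has to be fixed consistently in any implementation.

```python
import random


def bit_stream(p, length, rng):
    """Stochastic bit stream: each bit is 1 with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def mean(bits):
    """Fraction of 1's, i.e. the estimate of p carried by the stream."""
    return sum(bits) / len(bits)


def neuron(input_streams, weight_streams, threshold):
    """AND each input bit with its weight bit, count the 1's at every
    time step, and emit a 1 when the count reaches the threshold."""
    length = len(input_streams[0])
    out = []
    for t in range(length):
        count = sum(x[t] & w[t] for x, w in zip(input_streams, weight_streams))
        out.append(1 if count >= threshold else 0)
    return out


rng = random.Random(0)
LENGTH = 20000

# Unsigned values: AND multiplies the probabilities (0.6 * 0.5 = 0.30).
a = bit_stream(0.6, LENGTH, rng)
w = bit_stream(0.5, LENGTH, rng)
unsigned_product = mean([x & y for x, y in zip(a, w)])

# Signed values v = 2p - 1: here v_a = +0.6 and v_w = -0.4.
sa = bit_stream(0.8, LENGTH, rng)
sw = bit_stream(0.3, LENGTH, rng)
xor_value = 2 * mean([x ^ y for x, y in zip(sa, sw)]) - 1  # estimates -(v_a * v_w)
xnor_value = -xor_value                                     # estimates v_a * v_w

# Output probability of an 8-input neuron as the input probability rises.
N_INPUTS = 8
response = []
for p in (0.1, 0.5, 0.9):
    xs = [bit_stream(p, LENGTH, rng) for _ in range(N_INPUTS)]
    ws = [bit_stream(1.0, LENGTH, rng) for _ in range(N_INPUTS)]  # weights of 1
    response.append(mean(neuron(xs, ws, threshold=4)))

print(unsigned_product, xor_value, xnor_value, response)
```

The estimates carry sampling noise that shrinks as the streams get longer, which is the usual trade-off between precision and processing time in stochastic arithmetic.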

The functionality of the neuron has been demonstrated through various simulations.
A handwritten digit recognition task was chosen to test learning and
generalization. A clean set of 500 16 × 16 pixel samples chosen from the NIST database
was used. The fully connected feed-forward network has 256 input neurons, 12 hidden
neurons  and  10  output  neurons.  It  has  been  successfully  trained  on  400  input  patterns. 
This  was  achieved  with  a  network  structure  of  comparable  complexity  to  a  relatively 
compact  sigmoid  network  for  the  same  problem.  The  error  (the  vector  distance  between 
outputs)  on  the  training  sample  was  less  than  0.1  for  each  example,  though  this  was 
larger  than  that  achieved  with  the  sigmoid  network.  The  generalization  error  for  the 
network  was  slightly  larger  than  for  the  sigmoid  network,  which  appears  to  contradict 
the  conjecture  that  the  functionality  of  a  BSN  network  is  more  restricted. 

Simulations  of  BSN  networks  have  also  demonstrated  the  application  of  the  Mean 
Field  Annealing  algorithm  to  the  problem  of  graph  bisection.  The  quality  of  solution 
and  number  of  iterations  is  comparable  with  standard  neurons.  Massively  parallel  BSN 
implementations  could  process  large  graphs  in  real  time. 

3.  The  Two  Optical  Architectures 

In this paper we present two opto-electronic implementations of the neural architecture.
They are illustrated in figure 1. One design is fully spatially multiplexed (maximally
parallel), whilst the other introduces the extra dimension of time multiplexing.
A neuron in a time-multiplexed system receives the first bit from each connection in
turn and only delivers an output bit when all have been processed. This considerably
simplifies the design of the neurons, but at the expense of their operational speed. It
also simplifies the beamlet optics for a given number of neurons when compared to the
fully spatially multiplexed design.

3.1.  Overview  of  optical  components 

Both of the optical implementations require image replication and beam handling optics.
Image replication has been demonstrated [4] using computer generated holograms in conjunction
with bulk lens systems to produce output arrays consisting of up to 1000 x 1000
beamlets. Beam detection is achieved by the use of silicon/GaAs photodetectors, and
modulation is possible through the use of various opto-electronic devices. Modulation is
most easily achieved with liquid crystal spatial light modulators (SLMs), but only relatively slow
speeds in the region of 10 MHz may be realised. High speed modulation can be achieved
by the use of devices such as LED-based photothyristors [1] featuring switch-off times of less
than 10 ns, or arrays of individually addressable surface-emitting semiconductor lasers [3]
operating at speeds of up to 1 GHz. The choice of output modulator will depend on the
architecture used. Those with a high level of internal multiplexing will have relatively
low output bit rates, which may be achievable with SLMs. Non-multiplexed designs with
high output bit rates (of the order of 100 MHz) will require the use of fast output modulators.


Figure  1.  The  spatial/time  multiplexing  scale  of  implementations 

3.2.  Detail  of  spatial  architecture 

In the fully spatial implementation each neuron receives all of its input values and their
corresponding weights in parallel. On a given operational cycle a neuron will receive one
bit from each weight and input connection. These input/weight pairs are first multiplied;
the choice of AND or XOR will depend on the actual stochastic representation being
used.  The  resulting  weighted  input  bits  must  now  be  accumulated,  in  parallel,  prior 
to  thresholding  and  the  generation  of  a  stochastic  output  bit.  Accumulation  may  be 
achieved  in  several  ways. 

The first approach is to focus the n weighted bits onto a single analogue detector
that produces an analogue output proportional to the number of 1's. Current analogue
detectors of this type can resolve values for n of 64 whilst operating at speeds around
10 MHz. A second, purely digital approach would involve the use of fast adders incorporating
carry-save and look-ahead techniques. This approach is not ideal as relatively
large amounts of circuitry would be required, which in turn will impact performance.
The  final  approach  is  based  around  a  novel  population  detector  that  directly  exploits 
the  stochastic  nature  of  its  inputs  and  produces  an  appropriately  thresholded  stochastic 
output.  At  the  time  of  writing  the  details  of  this  device  cannot  be  disclosed  pending  a 
patent  application. 

3.3.  Detail  of  time  multiplexing  architecture 

In this approach a group of n input bit streams is transmitted to each of n neurons that
together form a layer. Such a fully connected layer will require a total of $n^2$ distinct
weight bit streams. The example illustrated in figure 1 shows n such layers operating in
parallel. Here we effectively have $n^2$ inputs, $n^3$ weights and $n^2$ outputs. Each neuron
has two physical connections designed to accept time-multiplexed input and weight bit
streams. Full connectivity within a layer is achieved by connecting together all of the
physical input connections of each of the constituent neurons, which are then supplied
with the time-multiplexed input bits. The corresponding weights are separately time
multiplexed and supplied directly to each neuron. These two multiplexing processes are
depicted in the circled areas in figure 1. Once the neurons have completed an operational
cycle, i.e. they have received n input/weight bit pairs and have performed the appropriate
computations, an output bit is generated and can be passed to the next layer. The critical
components in this approach are the shift registers used to multiplex both the input bits
and the weight bits. These subsystems must clock at n times the bit rate of the input
and output bit streams. Typical clock speeds will be of the order of 100 MHz. Each
neuron is composed of an AND or XOR gate to perform the input/weight multiplication
and a simple counter to compute the sum and activation; see [6] for details.
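
One operational cycle of the time-multiplexed layer can be sketched as follows. This is a minimal software model, not the circuit of [6]: the function names, the fixed integer threshold and the example bit patterns are assumptions introduced for illustration. Each neuron sees the shared, serialised input bits together with its own serialised weight bits, combines each pair with a single gate, counts the results, and emits one output bit at the end of the cycle.

```python
def neuron_cycle(input_bits, weight_bits, threshold, signed=False):
    """Process n input/weight bit pairs one after another and return the
    single output bit produced at the end of the operational cycle."""
    count = 0
    for x, w in zip(input_bits, weight_bits):
        count += (x ^ w) if signed else (x & w)  # one gate per neuron
    return 1 if count >= threshold else 0


def layer_cycle(input_bits, weight_rows, threshold, signed=False):
    """All neurons of a layer share the same serial input bits; each one
    receives its own row of serial weight bits."""
    return [neuron_cycle(input_bits, row, threshold, signed) for row in weight_rows]


def required_clock_hz(n, stream_bit_rate_hz):
    """The shift registers must clock n times faster than the bit streams."""
    return n * stream_bit_rate_hz


inputs = [1, 0, 1, 1]
weights = [
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
]
outputs = layer_cycle(inputs, weights, threshold=2)
print(outputs)                              # [1, 0, 1, 1]
print(required_clock_hz(64, 1.5625e6))      # 100000000.0
```

The second figure illustrates the clocking constraint: with n = 64 connections, a shift-register clock of 100 MHz supports input and output bit streams of about 1.56 Mbit/s.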

4.  Conclusions 

One  of  the  attractions  of  the  stochastic  design  is  its  adaptability  to  hardware  constraints. 
This  means  that  there  is  a  great  deal  of  flexibility  in  the  choice  of  hardware  configurations. 
At  the  cheapest  end  of  the  spectrum,  simple  FPGAs  can  be  used  to  provide  prototyping 
systems  to  verify  design  methodologies  for  larger  systems.  At  the  other  extreme,  this 
paper  has  indicated  how  the  latest  opto-electronic  techniques  can  be  utilised  to  give  very 
high  throughput  speeds  in  a  massively  parallel  mode. 

As  an  example  of  the  flexibility,  we  mention  that  a  neuron  can  easily  generate  a 
bitstream  for  the  derivative  of  its  output  with  respect  to  a  particular  input.  This  opens 
the  way  to  implementing  a  parallel  network  to  calculate  the  backpropagation  of  error 
through  the  main  network  and  hence  perform  gradient  descent  learning  on  chip. 

The  other  very  important  property  of  the  design  is  its  modular  capability.  Since 
the  functionality  can  be  precisely  described,  there  is  no  sense  in  which  uncertainty  or 
error  accumulates  through  layers  of  the  network.  This  contrasts  with  analogue  solutions, 
where  there  is  a  need  to  train  out  errors  in  larger  networks.  Hence,  for  the  bit  stream 
design  there  is  no  obstacle  to  concatenating  several  subnetworks  into  a  larger  system 
and backpropagating through one network in order to train another. Recent work by
Werbos [7] suggests that these more complex structures may be the key to realising the
full potential of Neural Networks in Neurocontrol applications.
Abstract: We present an opto-electronic hardware system that performs matrix-matrix multiplication
using an electrically addressable spatial light modulator as the switching device and computer
generated holograms for connection routing. This system is well suited for the implementation of
a quantised neural network. The operation of two demonstrator systems is described as well as a new
algorithm for on-line learning with quantised weights and hard thresholding based on function
smoothing and stochastic approximation. Finally, some results of test runs are presented.


1.  Introduction 

The system described here implements a three-layer perceptron. Restrictions imposed by the
binary nature of the Spatial Light Modulator (SLM) quantise the available weights
and force the use of hard-thresholding neurons. Thus standard gradient descent algorithms such as
Back-Propagation of Error (BPE) do not work and other solutions to learning had to be found.

The first is to run a computer simulation of the hardware using continuous variables in
conjunction with the hardware itself.

The second method was the use of the Stochastic Approximation with function Smoothing
(SAS) algorithm [1], further developed at King's College specifically for this problem [2]. The
SAS algorithm allows direct on-line learning.


2.  Demonstration  systems 

Both demonstrators perform the same matrix-matrix multiplication operation,

$$C = W \cdot A,$$

where A is the input pattern to layer n of the network, W is the weight matrix, C is the
unthresholded output and · represents the inner product operation. The control computer then
applies the thresholding function F, completing the pass from layer n of the network to layer n+1,

$$O_{n+1} = F(C),$$

where $O_{n+1}$ is the output from layer n+1. Next we set $A = O_{n+1}$ and repeat the process
through the same two-layer hardware to obtain $O_{n+2}$; multiplexing is used to
construct a network of as many layers as desired, in our case three.
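
The layer-by-layer reuse of the same hardware can be sketched in software. The example below is a minimal illustration, not the control program of the demonstrators: the function names, the small weight matrices and the threshold values are assumptions. The matrix product stands in for the optical stage, and the hard threshold for the step applied by the control computer.

```python
def mat_mul(w, a):
    """Plain matrix product C = W . A (the operation performed optically)."""
    inner, cols = len(a), len(a[0])
    return [[sum(row[k] * a[k][j] for k in range(inner)) for j in range(cols)]
            for row in w]


def hard_threshold(c, theta):
    """Thresholding function F applied element-wise by the control computer."""
    return [[1 if value >= theta else 0 for value in row] for row in c]


def forward(pattern, weight_matrices, thetas):
    """Feed the output of each pass back in as the next input pattern."""
    a = pattern
    for w, theta in zip(weight_matrices, thetas):
        a = hard_threshold(mat_mul(w, a), theta)
    return a


pattern = [[1], [0], [1]]             # input pattern as a column
w1 = [[1, 0, 1], [0, 1, 1]]           # weights, first pass
w2 = [[1, 1], [0, 1]]                 # weights, second pass
print(forward(pattern, [w1, w2], [2, 1]))   # [[1], [0]]
```

Each call to `mat_mul` corresponds to one pass through the optics, so a network with more layers costs additional passes rather than additional hardware.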
Figure 1: Transmissive demonstrator. H1 & H2 are computer generated holograms.


2.1  Transmissive  system 

The first demonstrator to be built was the transmissive version (figure 1) [4][5]. An expanded laser beam is
fanned out into 8x8 spots by hologram H1, which then proceed onto SLM1, where the input pattern A is encoded
as a binary intensity matrix. Hologram H2 further multiply images the input matrix 5x5 times onto SLM2, where
each pixel of the input matrix is given a weighting factor for each 'neuron', the 25 neurons being represented
by the 25 blocks on SLM2. The summation for each neuron is performed optically at the 5x5 detector array.

The  control  computer  applies  the  thresholding  function  F  and  modifies  the  weight  matrix 
according  to  the  learning  rule  in  use.  Operation  is  very  slow  due  to  the  use  of  the  serial  computer 
and  the  slow  SLM  addressing  time. 

The  computer  generated  holograms  [3]  are  phase  elements  etched  into  quartz  and  have  proved 
very  accurate.  In  order  to  encode  the  weights  at  SLM2,  H2  actually  produces  a  2x2  sub-fanout  of  each 
pixel  of  the  input  matrix  of  differing  intensities.  A  ‘maxi-pixel’  at  SLM2  (consisting  of  4  true  SLM 
pixels)  then  selects  different  combinations  of  these  beamlets  to  encode  a  weight  with  16  levels. 
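
As an illustration of how four binary SLM pixels can select 16 weight levels, the sketch below assumes that the four beamlets of the 2x2 sub-fanout have relative intensities 1, 2, 4 and 8. The text only states that the intensities differ, so the binary ratio and the function names are assumptions.

```python
from itertools import product

# Assumed relative intensities of the four sub-fanout beamlets (binary weighted).
BEAMLETS = (1, 2, 4, 8)

def weight_level(pixels):
    """Total intensity passed by a maxi-pixel; pixels holds four 0/1 SLM states."""
    return sum(b * p for b, p in zip(BEAMLETS, pixels))

def pixels_for_level(level):
    """SLM pixel states (0 = blocked, 1 = open) that encode a level 0..15."""
    if not 0 <= level <= 15:
        raise ValueError("level must be in 0..15")
    return tuple((level >> bit) & 1 for bit in range(4))

# Every on/off combination gives a distinct level under this assumption.
ALL_LEVELS = sorted(weight_level(p) for p in product((0, 1), repeat=4))
```

With any other set of four intensities the number of distinct levels could be smaller, since different combinations may then sum to the same value.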

2.2  Reflective  system 

The reflective demonstrator (figure 2) is the same in function as the transmissive one except folded in on itself
by making hologram H2 reflective. This way the hardware requirements and system size are reduced.
This is possible because the input matrix is displayed on the central area of the SLM whereupon
H2 reflects and multiply images onto the surrounding area on the return leg. However, this
arrangement now becomes much more sensitive to the polarisation states produced by the SLM and
we find that it is much more difficult to get a good contrast ratio at the output due to the SLM producing
elliptically polarised light. In addition, reflections, which were not a problem in the transmissive
system, are now mixed in with the desired signal and have proved very troublesome.


Figure 2: Reflective demonstrator. Note: some elements have been omitted for clarity.
2.3 Neural network operation

Both  demonstrators  are  run  as  a  64-24-9  perceptron,  i.e.  we  have  run  tests  with  9  memories  of  64  bits 
each.  A  standard  network  requires  both  positive  and  negative  weights  yet  an  optical  signal  can  only 
be  positive.  We  circumvent  this  problem  by  defining  a  total  intensity  level  as  the  unchanged  input  and 
calculating the negative weights for each neuron (that has positive weight connections) thus:

Negative=Total  -Positive. 
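
One possible reading of this rule, for the simplest case of bipolar weights, is sketched below. The interpretation (weights restricted to +1 and -1, with the negative contribution inferred from the total) is an assumption made for illustration; the paper's 16-level weights would need a scaled version of the same idea.

```python
# One reading of "Negative = Total - Positive", assuming weights in {+1, -1}.

def net_input(weights, inputs):
    """Signed weighted sum obtained from non-negative intensities only."""
    total = sum(inputs)                                   # unweighted input
    positive = sum(a for w, a in zip(weights, inputs) if w > 0)
    negative = total - positive                           # inferred, not measured
    return positive - negative

EXAMPLE = net_input([1, -1, 1, -1], [1, 1, 0, 1])
```

Only the positively weighted light and the total light need to be detected; the negative part follows by subtraction in the control computer.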


3.  On-line  learning  methods 

Quantised weights and hard thresholds produce a sampled discontinuous and hence
non-differentiable error (cost) function in weight space, which makes it unsuitable for minimisation
using standard back propagation of error. Some way round this has to be found in order to proceed.

3.1.  Back  propagation  of  error  using  a  software  simulation 

If  a  continuous  weight,  smooth  threshold  simulation  of  the  network  is  run  in  software,  it  can  be 
used  to  predict  the  approximate  hidden  and  output  layer  states  using  standard  BPE  with  a 
quantisation  stage  at  the  end.  The  hardware  weights  may  then  be  adjusted  according  to  a  simple 
local  rule  to  produce  the  required  states  [5].  But  this  is  ‘cheating’! 


3.2  Stochastic  approximation  and  function  smoothing 


When the sampled discontinuous cost function is convolved with a Gaussian probability function G(P,p)
it is transformed into a smooth differentiable form on which a gradient descent method may be used to find
the minimum (figure 3). In addition to actually allowing a minimisation to take place, the gradient descent
algorithm can avoid local minima if the smoothing coefficient p is started off at a suitably high value.

This smoothes the cost function to such an extent that all the local minima are ironed out and the lowest
point of the function will be in the vicinity of the global minimum (top trace in figure 3). As p is reduced during
the search, the smoothed function gradually becomes less smooth and approaches the original cost function
while the solution settles in the true global minimum (bottom trace in figure 3).

Stochastic approximation implements a technique to estimate the gradient of the cost function using
only a few random samples rather than an exhaustive calculation. The expectation of the search trajectory,
however, corresponds to a direct search. This reduces the number of samples of the function required, and
thus the number of passes through the hardware per learning iteration, without compromising accuracy.
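
The idea of smoothing plus random sampling can be illustrated with a one-dimensional toy problem. The sketch below is not the SAS algorithm of refs [1][2]; it is a generic Gaussian-smoothing gradient estimator, with the staircase cost, the geometric schedule for the smoothing width `rho` and all parameter values chosen as assumptions for the example.

```python
import random

def cost(w):
    """Illustrative staircase cost: a quantised parabola with its minimum near 3."""
    return round((w - 3.0) ** 2)

def smoothed_gradient(f, w, rho, samples, rng):
    """Estimate the gradient of f convolved with a Gaussian of width rho.

    Only a few random probes are used; each antithetic pair costs two
    evaluations of f (two passes through the hardware in the paper's terms).
    """
    total = 0.0
    for _ in range(samples):
        z = rng.gauss(0.0, 1.0)
        total += (f(w + rho * z) - f(w - rho * z)) * z / (2.0 * rho)
    return total / samples

def anneal(f, w, rho_start, rho_end, steps, rate, samples, seed=0):
    """Gradient descent on the smoothed cost while rho shrinks geometrically."""
    rng = random.Random(seed)
    decay = (rho_end / rho_start) ** (1.0 / max(steps - 1, 1))
    rho = rho_start
    for _ in range(steps):
        w -= rate * smoothed_gradient(f, w, rho, samples, rng)
        rho *= decay
    return w

W_FINAL = anneal(cost, w=0.0, rho_start=2.0, rho_end=0.1, steps=200,
                 rate=0.05, samples=8)
```

The estimator never differentiates the cost itself, so it works even though the staircase has zero slope almost everywhere; the slope information comes entirely from the random probes.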


Figure 3: Convolution with a smoothing function G turns the cost function into a differentiable form. By
starting off with a high p value and then gradually decreasing it, a gradient descent search will not get
stuck in a local minimum.
4. Experimental results

The  net  was  trained  to  recognise  9  patterns  consisting  of  pixellated  letters  of  the  alphabet  with 
various levels of noise added (to improve generalisation). Testing was then carried out with the
same  memories  with  added  noise  of  up  to  30%. 

The  software  simulation  method  produced  correct  classification  of  near  100%  with  up  to 
10%  input  noise,  dropping  to  60%  with  30%  noise. 

The SAS algorithm proved difficult to implement due to the slow hardware cycle. The drift of the
components (due to temperature variations) was faster than the learning time so the hardware characteristics
changed too fast for the learning procedure to adapt to them! Nevertheless, limited success was achieved
with the error reduced to 33%, and software simulations of the hardware worked perfectly.


5.  Conclusion 

The application of a quantised optical matrix-matrix multiplier system to a neural network implementation
has been demonstrated. A novel learning scheme has been used which gets over the problem of quantised
weights and hard thresholds imposed by the binary nature of the hardware. In the course of learning, the
network also learns the imperfections in the hardware enabling a high degree of robustness.

The  hardware  is  at  present  too  slow  for  on-line  learning  but  it  has  been  demonstrated  that 
it  does  work  and  with  improvements  in  speed  the  SAS  algorithm  would  work  well. 

Computer  generated  holograms  have  been  used  for  beam  routing  and  intensity  coding  and 
have  proved  very  reliable.  Other  advances  in  technology  such  as  smart-pixel  directly  addressable 
SLMs  and  an  SAS  adapted  to  ‘on-chip’  learning  should  allow  much  faster  wholly  parallel 
systems  to  be  developed.  This,  together  with  miniaturisation  along  the  lines  of  the  reflective 
system  with  the  use  of  microlens  arrays,  laser  diode  arrays,  smaller  and  faster  detector  arrays, 
and  smaller-pixel  SLMs  might  produce  systems  for  real  time  direct  image  processing  applications. 

Grey-level input can be implemented in the same way as the weights are coded on the SLM ‘maxi-pixels’,
allowing a greater information content in the input pattern and relaxing the hard-threshold constraint
Abstract. A family of lateral inhibition architectures which use the
self-linearised SEED effect to implement optical subtraction is described, and
their operation demonstrated in simulation.


1.  Introduction. 

Lateral  inhibition  networks,  where  nodes  within  a  processing  layer  inhibit  one  another,  form  a 
very  important  class  of  networks.  They  are  not  amenable  to  processing  on  conventional  serial 
machines,  since  they  require  some  type  of  iterative  self-consistent  solution,  thus  making  parallel 
implementation  very  attractive.  The  dense,  recurrent  nature  of  the  interconnection  is  ideally 
suited  to  an  optical  approach,  and  the  finite  range  of  the  interconnection  means  there  is  no 
intrinsic  limit  on  scaling  up  the  system. 

The  essence  of  any  inhibition  network  is  that  the  activity  level  of  an  individual  node 
must  decrease  in  response  to  increasing  input  from  neighbouring  nodes  i.e.  subtraction,  which 
is difficult to do optically. One approach has been to use an optically coupled pnpn light emitter
and an npn phototransistor.[1] However, the Self Linearised SEED effect, observed in a serial
photodiode-SEED modulator combination, offers an alternative method of optical
subtraction.[2,3] Feedback provided by the common current causes the modulator reflectivity to
decrease as the light falling on the photodiode increases. The small range of modulation can be
improved by incorporating the SEED device in a resonant Fabry-Perot optical cavity.[4] This
combination is made more attractive when we consider that the SEED is, itself, a pin diode,
allowing the possibility that the detector and modulator can be made together in the same
integrated process. Operating the diode within an optical cavity configuration will require
the diode to be biased, thus solving problems with quantum well diode voltage independence.[3]
However, the requirement that the modulator and detector photocurrents are similar implies
that the output is a small fraction of the input, thus limiting cascaded operation. Having
established that we have a good optical subtraction technology, we will go on to investigate a
family of architectures based on inhibition.

2.  Simple  lateral  inhibition. 


2.1 Single stage inhibition


A simple lateral inhibition optical system is shown in figure 1a in a one-dimensional format,
although  this  can  be  extended  to  a  two  dimensional  computing  surface.  Each  node  consists  of  a 
detector-modulator  pair,  and  receives  an  external  input  which  is  incident  on  the  modulator.  A 
portion  of  the  reflected  signal  is  redirected  to  the  detectors  of  neighbouring  nodes  in  some 
manner,  which  defines  the  lateral  interconnection  function  g(x).  The  node  does  not  inhibit 
itself.  The  optical  element  is  not  specified  and  may  be  reflective,  refractive,  diffractive,  or 
holographic.  Thus,  the  reflectivity  of  a  given  modulator  is  dependent  on  the  total  inhibitory 
signal  received  from  neighbours,  and  decreases  linearly  in  response  to  it.  A  point  to  note  is  that 
the  internal  state  of  the  node,  which  we  identify  with  the  modulator  reflectivity,  is  determined 
purely  by  the  neighbours,  and  is  not  dependent  on  the  input  to  that  node,  although  the  final 
output  is.  This  single  pass  geometry  effectively  results  in  a  convolution  of  the  input  with  the 
lateral interconnection function. Figure 1b shows the results of a typical simulation with flat
lateral  inhibition  g(x)  over  ±3  nodes.  A  periodic  boundary  condition  is  imposed  to  avoid  edge 
effects.  This  network  performs  a  simple  edge-enhancement  function.  The  response  of  the 
system  reflects  the  essentially  linear  nature  of  the  device.  While  still  linear,  the  state  of  the  node 
can  be  made  positively  dependent  on  the  input,  as  well  as  inhibited  by  neighbouring  nodes,  by 


Figure 1a Simple lateral inhibition system

Figure 1b Input, lateral inhibition function g(x) and the output.


2.2 Single stage inhibition with input subtraction


From figure 1b it can be seen that the result of a single stage inhibition is basically a reduced
scale version of the original input with some small perturbations representing the processing.
This suggests that if a suitably scaled version of the original input is subtracted from the output,
then the processing can be enhanced. Since this involves subtraction a second self-linearised
modulator detector pair can be used as shown in figure 2. A fraction of the initial input is taken
off and directed, on a one to one basis, to the detectors of a second detector/modulator array.
The output of the lateral inhibition processing is sent to the modulators of the "subtraction"
array, effectively subtracting the input and giving an enhanced, if small, output. This could
form the input to a further processing stage, such as the winner-take-all layer described in the
next section.
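
The two stages can be sketched numerically as follows. The model is an assumption made for illustration: each modulator's reflectivity falls linearly with the summed input of its neighbours over ±3 nodes (flat g(x), periodic boundary), and the second stage is idealised as a direct subtraction of a scaled copy of the input. The constants `k`, `r0` and `fraction` are hypothetical.

```python
def lateral_inhibition(u, reach=3, k=0.05, r0=1.0):
    """Single-pass inhibition: reflectivity falls linearly with neighbour input.

    u is a list of non-negative inputs; indices wrap (periodic boundary).
    """
    n = len(u)
    out = []
    for i in range(n):
        inhib = sum(u[(i + d) % n] for d in range(-reach, reach + 1) if d != 0)
        reflectivity = max(0.0, r0 - k * inhib)   # the node does not inhibit itself
        out.append(reflectivity * u[i])           # input reflected off the modulator
    return out

def subtract_input(y, u, fraction):
    """Idealised second stage: remove a scaled copy of the input, clamped at zero."""
    return [max(0.0, yi - fraction * ui) for yi, ui in zip(y, u)]

# A step edge in an otherwise flat input.
U = [1.0] * 8 + [2.0] * 8
Y = lateral_inhibition(U)
ENHANCED = subtract_input(Y, U, fraction=0.4)
```

In this toy input the node just above the edge ends up brighter than nodes deep inside the bright region, and the subtraction stage removes most of the flat background so that the edge stands out more clearly.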


Figure 2 Simple lateral inhibition with input subtraction. (Labelled elements: partially reflecting mirror; subtraction array and inhibition array, each with a detector array and a modulator array.)


3.  Lateral  inhibition  with  feedback 

By introducing a nonlinear, recurrent feedback of the output to the input, winner-take-all (W-T-A)
behaviour can arise. This can be done by using a second modulator array for input, as
shown  in  figure  3a.  The  modulator  array  is  uniformly  illuminated  and  the  external  inputs 
presented.  The  modulated  beams  are  then  directed  to  the  laterally  connected  layer  as  before. 
However,  if  the  output  is  monitored  and  used  to  control  the  input  modulator  array  a  recurrent 
loop  is  established.  All  that  is  required  is  that  the  input  is  presented  for  a  sufficient  time  for  the 
feedback  loop  to  be  established,  thereafter  the  input  is  removed  and  the  system  converges  to  a 
stable  state.  Again,  this  is  a  one-dimensional  example  of  a  possibly  two-dimensional  surface. 

Cheng  &  Wan  have  shown  that  there  must  be  gain,  however  small,  in  the  feedback 
loop;  if  not,  all  signals  will  decay  to  zero.[5]  More  importantly,  from  an  optical  point  of  view, 
they  have  examined  the  situation  of  limited  lateral  inhibition.  Global  interconnection,  leading  to 
a  single  winning  node,  is  not  feasible  in  an  optical  system,  as  finite  optical  power  can  only  be 
distributed  to  a  limited  neighbourhood.  They  have  shown  that  in  a  finite  interconnection 
network  local  winners  will  arise,  one  per  neighbourhood  as  defined  by  the  range  of 
interconnection. [5]  A  point  not  stressed  by  Cheng  &  Wan  is  that  each  cell  must  have  a  winner, 
even  if  the  input  in  that  area  is  perfectly  uniform.  Small  spurious  variations  arise  during  the 
iteration  process,  which  become  amplified,  eventually  giving  a  "winner".  Figure  3b  shows  an 
example  of  an  input  and  the  result  that  the  simulation  converged  upon.  As  can  be  seen,  local 
maxima are located, but a spurious peak also occurs (around node 22). Such a system could
form the heart of a self-organising network, when prefixed by a Hebbian network.[6]
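
A minimal numerical sketch of local winner-take-all behaviour is given below. It is not the authors' simulation: the update rule (linear inhibition over a limited range, a feedback gain slightly above one and clipping to the range 0 to 1) and all parameter values are assumptions chosen to show the effect.

```python
def winner_take_all(x, reach=2, k=0.5, gain=1.2, steps=20):
    """Iterate lateral inhibition with feedback gain until the activity settles.

    Activities are clipped to [0, 1]; indices wrap (periodic boundary).
    """
    n = len(x)
    x = list(x)
    for _ in range(steps):
        new = []
        for i in range(n):
            inhib = sum(x[(i + d) % n] for d in range(-reach, reach + 1) if d != 0)
            new.append(min(1.0, max(0.0, gain * (x[i] - k * inhib))))
        x = new
    return x

RESULT = winner_take_all([0.2, 0.9, 0.3, 0.1, 0.1, 0.1])
```

With the gain set to one or less in this model the surviving node would stop growing or decay, in line with the requirement for gain in the feedback loop noted above.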


Figure 3a Lateral inhibition with feedback. (Labelled elements: external input, optical power, self-linearised modulator array, detector array.)

Figure 3b Simulation of local winner-take-all behaviour. (Horizontal axis: node number.)


The  problem  of  spurious  peaks  arising  from  positive  gain  in  the  feedback  could  be 
ameliorated  by  introducing  a  nonlinear  feedback  function.  For  example,  if  a  sigmoidal  gain 
function were used, signals less than the point of inflection value would experience loss and so
be damped; only signals larger than the threshold set by the inflection value would grow.

A  further  nonlinearity  is  introduced  in  shunting  networks  where  the  lateral  inhibition 
signal  is  modulated  (multiplied)  by  the  value  of  the  node.  [7]  In  the  context  of  this  work  an 
optical  implementation  can  be  envisaged  whereby  the  inhibition  optical  signal  is  reflected  from 
the  node  modulator  before  detection.  This  would  necessitate  a  more  complex  optical 
arrangement.  It  should  be  noted  that  this  would  not  correspond  directly  to  the  conventional 
shunting  network  as,  in  this  case,  the  inhibition  signal  is  being  multiplied  by  the  modulator 
reflectivity  rather  than  the  single  state  variable  generally  used. 
Abstract: Based on fractional Fourier transforms a new optical architecture is
analyzed  for  filtering  both  in  spatial  and  frequency  domains,  and  is  further 
extended  to  adaptive  neural  networks.  Both  error  back-propagation  and  genetic 
algorithm  are  applicable  for  the  training  of  the  neural  networks.  The  developed 
single-layer  neural  network  is  suitable  for  large-scale  optical  implementations, 
and  may  be  utilized  as  a  feature  extractor  or  classifier.  Complicated  multi-layer 
neural  networks  may  incorporate  several  of  these  layers  in  serial  and/or  parallel 
architectures. 


1.  Introduction 

Recently Fourier transforms of fractional order have been developed.[1,2]
Unlike the ordinary Fourier transform, i.e. the fractional Fourier transform
with order 1, the fractional Fourier transforms extract features which
combine both spatial and frequency characteristics of the original images.
One may easily expect that the fractional Fourier transformed image is
something between the original image and full Fourier transformed image.
Also simple optical architectures with only a single lens or two lenses with
proper spacing were reported.[2]

It is well known that a Vander Lugt correlator can perform 2-D
correlation and shift-invariant filtering based on Fourier transformed
frequency information. On the other hand neural networks have been widely
utilized for position-dependent classification and associative memory.[3] The
shift-invariant correlation may be understood as characteristics of feed-forward
neural networks with Toeplitz interconnection matrix, which shares
appropriate interconnection weights. However this weight-sharing greatly
reduces network complexity to result in limited storage capacity. Instead
several neural network models incorporate local shared-interconnections to
provide local shift-invariance with good classification performance.[4-6]

In this paper we extend the Vander Lugt correlator to incorporate both
the shift-invariant (frequency) and position-dependent filtering based on 2
fractional Fourier transforms, and develop an analogy between this architecture
and neural networks. By virtue of the simple optical architectures for the
fractional Fourier transforms, the neural network is easy to implement in
large-scale.

2.  Optical  filters  based  on  fractional  Fourier  transforms 

We substitute the ordinary Fourier transforms in the 4-f Vander Lugt filter
with fractional Fourier transforms with orders $p_1$ and $p_2$. Following Ref.2,
we adopt an explicit integral formula for the fractional Fourier transform, i.e.

$$U_p(k) = \mathcal{F}^p\{u(x)\} = \int u(x)\, \exp\!\left[\frac{i\pi}{\lambda f_0 \tan\phi}\,(x^2 + k^2)\right] \exp\!\left[-\frac{i 2\pi}{\lambda f_0 \sin\phi}\, xk\right] dx, \quad (1)$$

and simple one thin-lens architecture with lens focal length $f = f_0/\sin\phi$ and
distances $Z = f_0 \tan(\phi/2)$, where $\phi = p\pi/2$, $\lambda$ is the wavelength, and $f_0$ is
an arbitrary constant. The infinite domain of the integral is understood and
suppressed throughout this paper. Within this approximation the input-output
mapping relationship of the $p_1$-$p_2$ filter is now derived as


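$$v(z) = \int U_{p_1}(k)\, H(k)\, \exp\!\left[\frac{i\pi}{\lambda f_0 \tan\phi_2}\,(z^2 + k^2)\right] \exp\!\left[-\frac{i 2\pi}{\lambda f_0 \sin\phi_2}\, zk\right] dk$$

$$\quad\;\; = \int u(x)\, h_{p_f}(t)\, \exp\!\left[\frac{i\pi}{\lambda f_0}\left(\frac{x^2}{\tan\phi_1} + \frac{z^2}{\tan\phi_2} - \frac{t^2}{\tan\phi_f}\right)\right] dx, \quad (2)$$

where u(x) and v(z) are input and output, respectively, and $h_{p_f}(t)$ is the
fractional Fourier transform with order $p_f = 2\phi_f/\pi$ of the filter H(k) located
between the two fractional Fourier transforms. The $\phi_f$ and t satisfy

$$\frac{1}{\tan\phi_1} + \frac{1}{\tan\phi_2} = \frac{1}{\tan\phi_f}, \qquad \frac{x}{\sin\phi_1} + \frac{z}{\sin\phi_2} = \frac{t}{\sin\phi_f}, \quad (3)$$

with $\phi_1 = p_1\pi/2$ and $\phi_2 = p_2\pi/2$.[8]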

It is worth looking at several special cases. If both $p_1$ and $p_2$
are 1 for the Vander Lugt filter, $\phi_f$ becomes $\pi/2$ and $t = x+z$. Now Eq.(2)
becomes correlation integral. When $p_1$ and $p_2$ satisfy $p_1+p_2 = 2$ ($\phi_1 + \phi_2 = \pi$),
$\phi_f$ again becomes $\pi/2$ and one obtains

$$v_{\phi_1}(z) = \int u(x)\, h\!\left(\frac{x+z}{\sin\phi_1}\right) \exp\!\left[\frac{i\pi}{\lambda f_0 \tan\phi_1}\,(x^2 - z^2)\right] dx, \quad (4)$$

where the h(.) is now ordinary Fourier transform of the filter H(k). The
subscript $\phi_1$ emphasizes that v(z) is function of $\phi_1$. Compared with the
Vander Lugt filter, Eq.(4) now has scaling factor ($\sin\phi_1$) in the filter
function. This scaling factor and exponential modulation term destroy
shift-invariance, and provide position-dependent classification with slight local
shift invariance. Feature extraction and/or classification of characters, images,
and speech signals may be useful applications of this optical filter. In
addition to the filter in Eq.(4), it may be possible to obtain ambiguity
function and Wigner distribution function.[8]
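
The special cases can be checked numerically. The sketch below assumes that the two conditions on the combined angle and the shifted coordinate read cot(phi_f) = cot(phi_1) + cot(phi_2) and t/sin(phi_f) = x/sin(phi_1) + z/sin(phi_2); this reading is consistent with the limits quoted in the text (phi_f = pi/2 and t = x+z for p1 = p2 = 1) but is a reconstruction, and the function name is hypothetical.

```python
import math

def combined_filter_geometry(p1, p2, x, z):
    """Return (phi_f, t) for orders p1, p2 (each strictly between 0 and 2).

    Assumes cot(phi_f) = cot(phi1) + cot(phi2) and
    t / sin(phi_f) = x / sin(phi1) + z / sin(phi2).
    """
    phi1 = p1 * math.pi / 2.0
    phi2 = p2 * math.pi / 2.0
    cot_f = math.cos(phi1) / math.sin(phi1) + math.cos(phi2) / math.sin(phi2)
    phi_f = math.atan2(1.0, cot_f)          # angle in (0, pi) with this cotangent
    t = math.sin(phi_f) * (x / math.sin(phi1) + z / math.sin(phi2))
    return phi_f, t

# Vander Lugt case and one case with p1 + p2 = 2.
PHI_A, T_A = combined_filter_geometry(1.0, 1.0, 0.3, 0.5)
PHI_B, T_B = combined_filter_geometry(0.5, 1.5, 0.3, 0.5)
```

In the second case the argument of the filter function is stretched by 1/sin(phi_1), which is the scaling factor discussed in the text.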

3.  Neural  Networks  based  on  Fractional  Fourier  Transforms 

In Fig.1 the neural network analogy of the fractional Fourier transform
and the spatial/frequency filters are shown, where the $u_n$ and $v_l$ are
input and output, respectively, and $U_m$ and $V_m$ are corresponding
fractional Fourier and inverse transforms. The fractional Fourier
transform operations are now substituted by 2 complex synaptic weights,
$W^1_{mn}$ and $W^2_{lm}$, and linear summation at $U_m$ and $v_l$. For classification
problems additional Sigmoid function S(.) may be used at the output. The $H_m$'s are
filter transmissions between the two fractional Fourier transforms and adaptively trainable.

Fig.1 Neural network analogy

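In mathematical notations one obtains

$$U_m = \sum_n W^1_{mn} u_n, \qquad V_m = H_m U_m, \qquad v_l = \sum_m W^2_{lm} V_m, \qquad y_l = S(|v_l|^2),$$

where from Eq.(1) the fixed global interconnections become

$$W^1_{mn} = \exp\!\left[\frac{i\pi}{\lambda f_0 \tan\phi_1}\,(m^2 \Delta k^2 + n^2 \Delta x^2)\right] \exp\!\left[-\frac{i 2\pi}{\lambda f_0 \sin\phi_1}\, \Delta k\, \Delta x\; mn\right],$$

$$W^2_{lm} = \exp\!\left[\frac{i\pi}{\lambda f_0 \tan\phi_2}\,(l^2 \Delta z^2 + m^2 \Delta k^2)\right] \exp\!\left[-\frac{i 2\pi}{\lambda f_0 \sin\phi_2}\, \Delta z\, \Delta k\; lm\right].$$

The $\Delta x$, $\Delta z$, and $\Delta k$ are distances between pixels, and designed to be equal.
It is worth noting that both the $W^1_{mn}$ and $W^2_{lm}$ are symmetric.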
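
A forward pass through this single-layer module can be sketched as below. The code assumes the weights are samples of the fractional Fourier kernel on a grid with equal pixel pitch, indexed from zero, followed by a logistic sigmoid; the wavelength, `f0`, pitch and filter values are hypothetical numbers chosen only so that the example runs.

```python
import cmath
import math

def frt_weights(n_pix, phi, pitch, lam, f0):
    """Fixed complex interconnections sampled from the fractional Fourier kernel."""
    a = math.pi / (lam * f0 * math.tan(phi))
    b = 2.0 * math.pi / (lam * f0 * math.sin(phi))
    return [[cmath.exp(1j * a * (m * m + n * n) * pitch * pitch)
             * cmath.exp(-1j * b * m * n * pitch * pitch)
             for n in range(n_pix)] for m in range(n_pix)]

def forward(u, h, w1, w2):
    """u -> U = W1 u -> V = H U -> v = W2 V -> y = S(|v|^2)."""
    size = len(u)
    big_u = [sum(w1[m][n] * u[n] for n in range(size)) for m in range(size)]
    big_v = [h[m] * big_u[m] for m in range(size)]
    v = [sum(w2[l][m] * big_v[m] for m in range(size)) for l in range(size)]
    return [1.0 / (1.0 + math.exp(-abs(vl) ** 2)) for vl in v]

# Hypothetical numbers chosen only to make the example run.
LAM, F0, PITCH = 0.633e-6, 0.1, 1.0e-4
W1 = frt_weights(4, 0.5 * math.pi / 2.0, PITCH, LAM, F0)   # order p1 = 0.5
W2 = frt_weights(4, 1.5 * math.pi / 2.0, PITCH, LAM, F0)   # order p2 = 1.5
Y = forward([1.0, 0.0, 1.0, 0.0], [1.0, 0.5, 0.5, 1.0], W1, W2)
```

Only the filter values `h` are trainable in this sketch; the two weight matrices are fixed by the optics, which is the sense in which the synapses are global and fixed while the gains are local and adaptive.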

This architecture with fixed global synapses and adaptive local control
gains is similar to the TAG (Training by Adaptive Gain) model [9], and the
popular error back-propagation learning algorithm is still applicable. By
defining a total output error as

$$E = \frac{1}{2} \sum_s \sum_l \left(y_l^s - t_l^s\right)^2 \quad (8)$$

for all stored input-output pairs $(u^s, t^s)$ one may train the neural networks by
error back-propagation [8] or genetic algorithm. Not only the filter function
$H_m$'s but also the order of fractional Fourier transform $\phi_i$ may be adapted.
For  the  latter  case  with  error  back-propagation  learning  algorithm  the 
gradient  may  be  calculated  as 
The adaptation of the fractional order is essential for proper feature
extraction in multi-layer classifier network, where the first layer does feature
extraction and subsequent layers do classification.

The single-layer neural networks in Fig.1 may be implemented by 2
thin lenses and one spatial light modulator (SLM), which is much simpler
than existing optical architectures utilizing multifacet holograms, lenslet
arrays with SLM, or volume holograms. Error back-propagation also
requires one thin lens. Multi-layer neural networks consist of several of these
modules. In this case the first layer may emphasize local features by
assigning higher order ($p_1 \approx 1$) for the fractional Fourier transform, while
the orders may become smaller for position-dependent classification at the
latter layers. The whole networks may be adapted by error back-propagation
or genetic algorithm.

One problem might arise at practical optical implementations. Even for
symmetric real function h(x), its fractional Fourier transform H(k) becomes
a complex function. Performance of phase-only filter may need to be
investigated in this case.

For some cases including multiple feature extractions, the ambiguity
function and Wigner distribution function cases, one would like to have
fractional Fourier transforms with many different orders simultaneously.
These can be done by lenslet arrays instead of single lens. To use same
distance Z, the focal length of each small lens in the lenslets should be
$f = f_0/\sin\phi = Z/(\tan(\phi/2)\,\sin\phi)$.
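
As a small worked example, the focal length needed for each order at a fixed distance Z follows from f = Z / (tan(phi/2) sin(phi)) with phi = p pi/2. The distance of 0.1 m and the three orders below are illustrative assumptions.

```python
import math

def lenslet_focal_length(p, distance):
    """Focal length giving fractional order p at a fixed lens-to-plane distance Z."""
    phi = p * math.pi / 2.0
    return distance / (math.tan(phi / 2.0) * math.sin(phi))

# Focal lengths (in metres) for three orders sharing the same Z = 0.1 m.
FOCAL_LENGTHS = {p: lenslet_focal_length(p, 0.1) for p in (0.5, 1.0, 1.5)}
```

For p = 1 the focal length equals Z, the familiar 2-f Fourier transform arrangement; smaller orders need longer focal lengths and larger orders shorter ones.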

4.  Conclusion 

In  this  paper  we  present  a  new  optical  architecture  for  both  spatial  and 
frequency  filtering  and  also  an  adaptive  neural  networks  module  based  on 
fractional  Fourier  transforms.  Complicated  multi-layer  neural  networks  may 
consist  of  several  modules  in  serial  and/or  parallel  architectures.  With  this 
simple  architecture  large-scale  optical  implementation  of  adaptive  neural 
networks  becomes  feasible. 
Abstract. We present character recognition results obtained with a
self-organizing map neural network. The optical experimental system is built
around two Ferroelectric Liquid Crystal Bistable Optically Addressed Spatial
Light Modulators (FLC-BOASLMs) in a resonator configuration.


1.  Network  Model 


The  network  we  implement  is  based  on  Kohonen’s  self-organizing  map  [1]  where  the 
spatial  neighbourhood  of  the  active  neurons  is  taken  into  account  during  the  learning 
process  to  produce  a  topological  organization  of  the  activity:  similar  inputs  produce 
similar  map-layer  activities.  Two  major  modifications  are  made  to  the  original  Kohonen 
model  to  adapt  it  to  our  optical  implementation. 

The first concerns the neural decision where we replace the Winner-Take-All (WTA)
decision by a global neural-map layer thresholding. This is because the WTA is a
non-local nonlinearity which is difficult to perform with our BOASLMs. The WTA function
can be realized with silicon-backplane SLMs [2] but our computer simulations (section
3) show that the simpler global thresholding can nevertheless lead to a system which
learns and classifies as required, albeit with an expected reduction in capacity.

The second modification concerns the weights renormalization, again made difficult
by the use of BOASLMs. The BOASLMs do, however, permit switching in both
directions: to opaque and transmissive states. We use this capacity to reinforce active
connections and weaken inactive ones, a procedure which, when correctly balanced,
prevents weight saturation. This technique has also been confirmed in simulations.

With these modifications for a pixelized, binary, unipolar input pattern, $X_{ij}$ ($i,j = 1 \cdots n$),
an array of weight maps $\{W\}_{ij}$, a map-layer neural activity, $\{Y\}$, and a
map-layer neural firing, $\{Y'\}$, the network algorithm becomes:

$$\{Y\} = \sum_{i=1}^{n} \sum_{j=1}^{n} X_{ij}\, \{W\}_{ij}, \qquad \{Y'\} = f(\{Y\}), \quad (1)$$

$$\{W\}_{ij}(t+1) = \{W\}_{ij}(t) + \mu\, X_{ij}\, \{Y'\} - \mu'\, \bar{X}_{ij}\, \{Y'\} \quad (2)$$

where {} indicates a quasi-continuous quantity (the BOASLMs are non-pixelized
devices), $\bar{X}$ is the complement of X, $\mu$ and $\mu'$ are learning rate coefficients and f is a
combined thresholding neighbourhood operator.
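
The algorithm can be sketched in software as follows. The sketch makes several simplifying assumptions that are not in the paper: the map layer is a short one-dimensional list rather than a continuous two-dimensional surface, the operator f is a global threshold followed by a one-step dilation, and weights are clipped to the range 0 to 1 to mimic the opaque and transmissive limits. All names and numbers are illustrative.

```python
def activity(x, w):
    """Map-layer activity: sum of the weight maps selected by active inputs."""
    size = len(w[0][0])
    return [sum(x[i][j] * w[i][j][k]
                for i in range(len(x)) for j in range(len(x[0])))
            for k in range(size)]

def firing(y, theta, radius=1):
    """Global threshold followed by a simple dilation (assumed form of f)."""
    hot = [v > theta for v in y]
    return [1 if any(hot[max(0, k - radius):k + radius + 1]) else 0
            for k in range(len(y))]

def update(x, w, f, mu, mu_bar):
    """Reinforce weight maps of active inputs, weaken those of inactive inputs."""
    for i in range(len(x)):
        for j in range(len(x[0])):
            step = mu if x[i][j] else -mu_bar
            w[i][j] = [min(1.0, max(0.0, wk + step * fk))
                       for wk, fk in zip(w[i][j], f)]
    return w

# 2x2 binary input, each input pixel owning a 5-element weight map.
X = [[1, 0], [0, 1]]
W = [[[0.9, 0.2, 0.1, 0.1, 0.1], [0.5] * 5],
     [[0.5] * 5, [0.8, 0.3, 0.1, 0.1, 0.1]]]
Y = activity(X, W)
F = firing(Y, theta=1.0)
W = update(X, W, F, mu=0.1, mu_bar=0.1)
```

Because the update is driven only by the firing pattern and the input, weights outside the firing zone are left untouched, and balancing the two learning rates is what keeps the weights from saturating.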


2.  Experimental  system 


The experimental system, described in detail in [3], is centred around two BOASLMs. If
such a modulator is illuminated with a given pattern and an electrical pulse is applied to
the electrodes, the pattern can be binarized and stored on the modulator for subsequent
rereading. One modulator is used in this way to perform the thresholding of the neural
activity. Although the BOASLMs are intrinsically bistable, by operating them near the
threshold voltage and using spatial integration techniques they can be made to show
a grey-level behaviour [4]. A second modulator is used in this manner to store and
update the synaptic weights. The sensitivity of the modulators can be varied by varying
the height and duration of the electrical pulse. We use these parameters to change the
threshold level and the learning rate.


Figure  1.  (a)  Creation  of  the  neural  activity.  (b)  Weight  updating 

The basic operation of the system is shown above. The input pattern is presented
to the system with an electrically addressed SLM built by CRL. The SLM is a 128x128
pixel FLC device; we group the pixels 12x12 to form one input pixel. The input pattern is
imaged onto the array of weight maps (first BOASLM) thus activating the corresponding
weight maps. An optical crossbar, consisting of a lenslet array and a collimating lens,
then images and superimposes the activated weight maps onto the second (neural map)
BOASLM thus realizing the required sum (eq.1a). The resulting neural activity is
binarized and stored on the second BOASLM. The neighbourhood function is introduced by
varying the height and duration of the voltage pulse sent to the BOASLMs to dilate the
neural firing pattern.

This firing is read with a plane wave, duplicated with a Dammann grating and
fed back, through the input image, to the weight map BOASLM where it is added to
the active weight maps and subtracted from the inactive ones. The effect is to reinforce
the connections between active input neurons and active zones in the neural map, while
at the same time weakening non-productive connections. In this way a Hebbian-type
learning rule is implemented.
The whole system is synchronized and the electrical pulse parameters varied with
a personal computer which otherwise plays no part in the implementation. All neural
calculations, thresholding, weight updating etc. are performed in parallel in optics.


3.  Computer  simulations 

To confirm the feasibility of the implementation we have performed some computer
simulations of the algorithm and the optical system. The simulations take some of the
limitations of the optical system into account: in particular, they use the
reinforcing/weakening procedure described above instead of weight renormalization, and a
simple global threshold instead of the WTA decision. In addition, the neighbourhood
function is a simple dilatation. Three input classes were used: 8x8 pixel images of the
characters F, I and L, with three examples per class.
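
The simulated learning step described above can be sketched in a few lines of Python. This is only an illustrative sketch and not the authors' simulation code: the map size, the threshold fraction, the 3x3 dilatation kernel, the learning rate and the clipping of the weights to [0, 1] are all assumptions, and every function name is hypothetical.

```python
# Illustrative sketch of one simulated learning step (all constants are assumptions).

def activity(weights, image):
    """Sum the weight maps selected by the active input pixels."""
    size = len(weights[0])
    act = [[0.0] * size for _ in range(size)]
    for w_map, pixel in zip(weights, image):
        if pixel:
            for r in range(size):
                for c in range(size):
                    act[r][c] += w_map[r][c]
    return act

def threshold(act, fraction=0.9):
    """Simple global threshold: fire where activity reaches a fraction of its maximum."""
    peak = max(max(row) for row in act)
    return [[1 if peak > 0 and v >= fraction * peak else 0 for v in row] for row in act]

def dilate(firing):
    """Neighbourhood function: 3x3 binary dilatation of the firing pattern."""
    size = len(firing)
    out = [[0] * size for _ in range(size)]
    for r in range(size):
        for c in range(size):
            if firing[r][c]:
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < size and 0 <= cc < size:
                            out[rr][cc] = 1
    return out

def learn_step(weights, image, rate=0.1):
    """Add the dilated firing to active weight maps, subtract it from inactive ones."""
    firing = dilate(threshold(activity(weights, image)))
    size = len(firing)
    for w_map, pixel in zip(weights, image):
        sign = 1.0 if pixel else -1.0
        for r in range(size):
            for c in range(size):
                w = w_map[r][c] + sign * rate * firing[r][c]
                w_map[r][c] = min(1.0, max(0.0, w))  # clipping to [0, 1] is assumed
    return firing
```

Here `weights` holds one map per input pixel and `image` is the flattened binary input; calling `learn_step` repeatedly over the training images plays the role of the reinforcing/weakening procedure.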

Figures  2  and  3  show  some  results  of  the  simulations.  In  figure  2  from  left  to 
right  are  the  input  image,  the  weight  map  after  learning,  the  map-layer  activity  (before 
thresholding)  and  the  map-layer  firing  (after  thresholding  and  dilatation).  Figure  3  shows 
the  inputs  and  corresponding  firings  for  the  other  input  images. 


Figure  2.  Results  of  computer  simulations 


Figure  3.  Simulated  map-layer  firings  for  different  inputs. 

We can see that the neural-map layer activity has been spatially organized by the
learning process, with different zones of the map being active for the different input
classes. The learning also generalizes, giving essentially the same output for each
example in a class.

4.  Experimental  results 

The first step in the experimental verification consisted of initializing the array of
weight maps with a post-learning array obtained from the simulations and using it to check
the capacity of the experimental system to recognize the different inputs correctly. The
initialization was performed by sending the array to the EASLM (using its full resolution)
and imaging it onto the weight map BOASLM. The activities produced on the neural-map
BOASLM for the different input classes appear in figure 4, which shows from left to right
the firings for inputs F, I and L. We can see a clear correspondence with the results of
the simulations (figure 3).


Figure  4.  Experimental  map-layer  activities  for  different  inputs. 


The second, more difficult step in our experiments consists of the implementation
of all-optical learning. Our first results in this direction appear in figure 5, where we
see the initial and final weight maps and the activities for two different input images.
The weight map has effectively been modified and differing activities produced.


5.  Conclusion 

We have demonstrated the capacity of our system to recognize input characters and
shown evidence of a learning capability. Improvements could be made by using a more
complex decision stage (for example with a smart OASLM), which appears to be the
major limitation at present.


Abstract

We  experimentally  analyze  the  origin  of  cross-talk  in  a  photorefractive  memory  in  which  images 
are  multiplexed  with  the  deterministic  phase  encoding  technique.  We  demonstrate  a  simple 
method  to  efficiently  reduce  most  of  this  cross-talk. 


1.  Introduction 

All the methods used to multiplex holograms in photorefractive crystals, such as angular,
frequency or phase-encoded multiplexing, suffer from one common source of cross-talk due
to the energy diffracted from non-Bragg-matched gratings. This source of noise is not very
restrictive and allows large storage capacities [1]. Nevertheless, the phase encoding
multiplexing scheme presents other sources of cross-talk, originating from set-up
imperfections, that may severely limit the storage capacity.

In this technique, each image is recorded by interfering the image beam with N
overlapping reference beams. The set of relative phases between the reference beams is the
image code. One may construct N sets of orthogonal phase codes so that N images can be
stored in the material. A stored image is read out by illuminating the crystal
with all N reference beams with the set of phases used to record that image. We use binary
$(0, \pi)$ orthogonal phase codes (Hadamard codes). Theoretically, noiseless reconstructions
appear because the reconstructions of undesired images interfere destructively. Nevertheless,
if the phase shifts differ slightly from $(0, \pi)$ and/or if the diffracted reference beam
amplitudes are not the same, then the sets of phase codes are not perfectly orthogonal. This
induces noise. We previously [2] derived the expressions for the amplitude signal of the
image, and for the amplitude noise on this image due to cross-talk with other recorded
images. They are proportional to:

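$$S_p \propto A_p \sum_{n=1}^{N} \eta_n^{p}, \qquad B_p \propto \sum_{m=1,\, m \neq p}^{M} A_m \sum_{n=1}^{N} \eta_n^{m} \exp\left[i\left(\varphi_n^{p} - \varphi_n^{m}\right)\right] \qquad (1)$$

where $S_p$ and $B_p$ are the signal and noise amplitudes for the $p$th image, $\varphi_n^{m}$
is the phase of the $n$th reference beam during the recording of the $m$th image, and
$\eta_n^{m}$ is the diffracted amplitude from the $n$th reference beam onto the grating
corresponding to the $m$th image. $A_m$ is the amplitude of the binary recorded image,
$A_m = (0, 1)$, so that $S_p \propto (0, N)$.

One sees from this formula that noise may appear either if the phase shifts differ from
$(0, \pi)$ or if the diffracted amplitudes are not uniform.

Hereafter, we experimentally investigate the influence of such imperfections on noise. A
technique to decrease this cross-talk will be demonstrated.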
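
The orthogonality argument of this section can be checked numerically. The following Python sketch assumes a Sylvester-type Hadamard construction and a reference-beam overlap of the form "sum over the beams of exp[i(phase for image p minus phase for image m)]", optionally weighted by per-beam amplitudes. The function names are hypothetical and the 0.03 rad phase error used in the usage lines is an arbitrary illustrative value, not a measured one.

```python
import cmath
import math

def hadamard(order):
    """Sylvester construction; order must be a power of two. Entries are +1/-1."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

def phase_codes(order, eps=0.0):
    """Map +1 to phase 0 and -1 to phase pi + eps (eps: systematic phase error)."""
    return [[0.0 if x == 1 else math.pi + eps for x in row] for row in hadamard(order)]

def overlap(codes, p, m, amplitudes=None):
    """Sum over reference beams of amp_n * exp(i * (phi_n^p - phi_n^m))."""
    n_beams = len(codes[p])
    amps = amplitudes or [1.0] * n_beams
    return sum(a * cmath.exp(1j * (codes[p][n] - codes[m][n])) for n, a in enumerate(amps))

ideal = phase_codes(64)            # perfect (0, pi) codes
noisy = phase_codes(64, eps=0.03)  # assumed systematic error, illustration only

same_image = overlap(ideal, 3, 3)        # read-out of the addressed image
other_image = overlap(ideal, 3, 5)       # cancels for perfect codes
with_uniform = overlap(noisy, 0, 5)      # first order in eps (code 0 is uniform)
without_uniform = overlap(noisy, 3, 5)   # second order in eps
```

With perfect codes the overlap between two different codes vanishes, while a small systematic error leaves a residue that is much larger when one of the two codes is the uniform one (row 0 of the Hadamard matrix).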


2.  Experimental  arrangement 

Our set-up was described elsewhere [3]. With 64 phase-coded reference beams we can store
64 images in a BaTiO$_3$ sample. A dynamic 8x8 phase modulator allows fast retrieval of all
images. In that set-up the uniformity of the diffracted amplitudes and the accuracy of the
phase modulator ensure a large signal-to-noise ratio. We measured it to be, in intensity,
SNR > 140.

In order to investigate the origin of noise in that kind of multiplexing, we intentionally
misaligned our set-up. First, a disturbance of the pixellated phase modulator produces a
phase shift of $\pi + \varepsilon$ (instead of $\pi$), with $\varepsilon$ a systematic phase
error. A random phase error with a zero mean value was indeed proven to contribute negligibly
to the cross-talk. Second, we introduced an amplitude noise by rotating the photorefractive
crystal and working in an angular range where the diffraction efficiency varies greatly with
the incident angle of the reference beam.

To measure these diffracted amplitude modulations, we modified the phase modulator
and used it as an amplitude modulator. Each reference beam could be switched on while all
the others were off. We recorded N holograms (one for each reference beam) of an entirely
white image. We then measured the diffracted signal when reading out the hologram with
each reference beam. We thus determined the diffracted amplitudes. A representation of these
measurements is plotted in figure 1a. The reference beams are arranged in a square matrix.
The $\theta$ direction is in the incident plane; $\phi$ is the perpendicular direction. We see
that we have a random noise, $\sigma$ = 25%, plus slow variations in the $\theta$ and $\phi$
directions.


Figure 1: a) 3D plot of the 64 measured diffracted amplitudes versus the two angles $\theta$
and $\phi$; b) two noisy codes.

These fluctuations come from: first, the fan-out grating, which transforms the single
reference beam into 64 reference beams (the intensity variation between the reference beams
is lower than $\sigma \approx 4\%$); second, the phase modulator, which induces a fluctuation
lower than 2%; third, and mainly, the variation of the angles between the different reference
beams and the image beam in the crystal, which explains the dependence in the $\theta$
direction.


3.  Phase  error 


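Because the phase noise is a systematic noise, we can easily derive analytical expressions
from formulae (1) in the case of uniform diffracted amplitudes. We find the same expression
for the cross-talk in all images except for the image recorded with the uniform phase code
($\varphi_n^{1} = 0$ for all $n$). The image recorded with the uniform code is the only one
to contribute to the noise at first order in $\varepsilon$. From formulae (1), we find for
the noise amplitude on the first image:

$$B_1 \propto \frac{N}{2}\,(1 - \cos\varepsilon + i\sin\varepsilon) \sum_{m=2}^{M} A_m \qquad (2)$$

and for $p \neq 1$:

$$B_p \propto \frac{N}{2}\left[(1 - \cos\varepsilon) \sum_{m=2,\, m \neq p}^{M} A_m + (1 - \cos\varepsilon - i\sin\varepsilon)\, A_1\right] \qquad (3)$$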

4.  Computer  simulations  and  amplitude  noise 

Because of the random distribution of the diffracted amplitudes it is impossible to derive
an analytical formula for that cross-talk. Therefore, we computed values for formulae (1).

First of all, we checked that a phase-only error (with $\varepsilon \ll 1$) produces uniform
cross-talk between all images, except for the one recorded with the uniform code. These
results are in agreement with formulae (2) and (3). Then we introduced into the simulation
all the measured values of the diffracted amplitudes: the noise increases. We found it to be
now different for each image. The noisiest image (largest noise for $p \neq 1$) depends on
the spatial distribution of the fluctuations of the diffracted amplitudes. For instance, with
the distribution shown in figure 1a, the noisiest images are the ones recorded with the
uniform code and with the codes shown in figure 1b.

We experimentally checked that point. We inserted a neutral density filter on the right
half of the reference beam matrix to partly compensate for the dependence of the diffracted
amplitudes on $\theta$. Consequently we measured a reduction of the noise on the image
recorded with the first code shown in figure 1b. We can say that the appearance of noise in
some images corresponds to the similarity of the code geometry with the geometry of the
variations of the diffracted amplitudes (see figure 1).


5.  Technique  for  cross-talk  reduction 

We propose a two-step technique to reduce cross-talk between images. Both steps are derived
from the observation of formulae (2,3).

The first idea is to set the amplitude of the first image to $A_1 = 0$. The computer
simulation showed that this efficiently reduces cross-talk even when the diffracted
amplitudes are not uniform. We experimentally recorded first a set of images with the first
image being white, $A_1 = 1$, and second a set of images with this first image being black,
$A_1 = 0$. For instance, for $\varepsilon = 2 \times 10^{-2}$ rad, we measured an intensity
signal to noise ratio divided by a factor 2.




The second idea to decrease the cross-talk is to discard the term $\sum_m A_m$ in formulae
(2,3) while keeping $A_1 = 0$. The addition of a supplementary uniform phase modulator on the
image arm was previously proposed [2]. A 0 or $\pi$ phase shift is randomly superposed
on the image amplitude during the recording process. On average, the sum of the image
amplitudes then becomes close to zero. The computer simulations showed that this technique is
also efficient when we have both a phase error $\varepsilon$ and non-uniform diffracted
amplitudes. We experimentally verified that technique by inserting a uniform phase modulator
on the reference arm. For instance, for $\varepsilon \approx 3 \times 10^{-2}$ rad and without
that technique we found an intensity signal to noise ratio (averaged over all images except
the noisiest) larger than 100; for the image recorded with the uniform code we found only 3;
and for the images recorded with the codes shown in figure 1b we got 7. By applying that
technique we measured an intensity signal to noise ratio larger than or about 150 for all
images. There is no longer any excess noise on particular images.
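
The effect of the random 0 or $\pi$ phase shift can be illustrated with a short Python sketch. It assumes that all images are white (unit amplitude) and that each one is multiplied by +1 or -1 with equal probability, which is how a 0 or $\pi$ phase shift acts on a real amplitude; the number of images (63), the number of trials and the function name are illustrative assumptions.

```python
import random

def residual_sum(num_images, randomize, rng):
    """Sum of unit image amplitudes, each optionally flipped by a random 0/pi phase."""
    signs = [rng.choice((1, -1)) if randomize else 1 for _ in range(num_images)]
    return sum(signs)

rng = random.Random(1)
trials = 2000

# Without the random phase the amplitudes add up coherently.
fixed = residual_sum(63, False, rng)

# With the random phase the sum behaves like a random walk: its mean square
# is of the order of the number of images instead of its square.
mean_square = sum(residual_sum(63, True, rng) ** 2 for _ in range(trials)) / trials
```

In this toy model the squared sum drops from the square of the number of images to roughly the number of images, which is the sense in which the summed term becomes small on average.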


6.  Conclusion 

We recorded 64 images in our set-up with low cross-talk between images. The sources of
cross-talk were deliberately increased by misaligning the set-up in order to make the
analysis easier. We demonstrated a simple and very efficient solution to reduce that
cross-talk and therefore to enhance the memory capacity.


Abstract. Some "self correcting" learning laws are compared with photorefractive dynamics.
Possible optical implementations of the corresponding basic neural units with photorefractive
crystals are proposed, and the key characteristics required to realize a self-organizing
network are experimentally demonstrated. An optoelectronic network architecture based on
the units presented above and on a self-imaging feedback system is discussed.


1.  Introduction 

Neural self-correcting learning algorithms allow feature extraction from a stochastic set of
input signals and selectively self-adapt to provide maximal activation for signals exhibiting
the same features [1]. The learning is conducted by comparing the input signal with the weight
vector, in which are stored the common features extracted by the algorithm during the learning
of previously processed signals. The mismatch is considered as an effective error. Learning
the current input then corrects this error by updating the weight vector components in
proportion to that mismatch. Thus the algorithm itself finds the classification code from the
real-time inputs. Digital simulations are mostly used to run this kind of algorithm, and only
the simplest versions are implemented because of the complexity of the others.

However,  the  fundamental  dynamics  of  hologram  creation  in  some  nonlinear  optical 
materials  appears  to  be  perfectly  fitted  to  implement  these  algorithms  [2,3].  In  this  work  we 
show,  theoretically  and  experimentally,  that  different  learning  algorithms  can  be  implemented 
by  taking  advantage  of  the  kinetics  of  photorefractive  holograms  [2]. 


2.  Theoretical  model 

Consider a neural unit performing weighted summation on the components of each input
vector $S = (S_1, \ldots, S_n)$ with its weight vector $W = (W_1, \ldots, W_n)$. The neuron
output $\eta$ is a function $f$ of the scalar product $W \cdot S$. During the training period,
a set of input vectors is presented to the neuron and $W$ is adapted as a function of $\eta$.
Some of the most interesting self-adaptation laws are described by the vectorial equation [1]:

$$\frac{dW}{dt} = a\,\eta^{q}\, S - b\,\eta^{p}\, W \qquad (1)$$

where t is the time, a and b are positive constants, and the parameters p and q are
non-negative integers defining the learning law. The potentialities and performances of the
neuron differ considerably according to the values of p and q, p = 0,1,2, q = 0,1 [1]. For
instance, this adaptation law makes W become proportional to the "moving average of S" for
p = q = 0 (it does not show any interesting self-organizing characteristics) [1]. On the
contrary, the choice of parameters p = 1, q = 1 or p = 2, q = 1 produces interesting adaptive
behaviors. These adaptation laws show such key characteristics as competitive learning and
feature extraction. Namely, W converges towards one of the learned vectors when, say, two
orthogonal vectors are presented at the input (see the computer simulation results in fig. 1,
e.g., for p = 1, q = 1, f linear, and a = b = 1). It converges towards the vector which
represents the principal characteristics of the group of winning images when different
orthogonal groups (each composed of correlated images) are presented for learning. This is
W = (0.5, 0, 0, 0, 0, 1, 1, 1, 0.5, 0) or W = (0, 0.5, 1, 1, 1, 0, 0, 0, 0, 0.5) for the
above learning law and the inputs:

$S^1$ = (0,1,1,1,1,0,0,0,0,0); $S^2$ = (0,0,0,0,0,1,1,1,1,0);
$S^3$ = (0,0,1,1,1,0,0,0,0,1); $S^4$ = (1,0,0,0,0,1,1,1,0,0).    (2)
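
The competitive behaviour for p = q = 1 can be reproduced with a minimal Python sketch. It assumes a linear f, a = b = 1, a simple Euler discretization and two short orthogonal binary vectors chosen only for illustration; the step size, the number of steps, the weak symmetric initial weights and the function names are likewise assumptions rather than the parameters of the simulations referred to in the text.

```python
import random

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def train(inputs, steps=3000, dt=0.1, a=1.0, b=1.0, seed=0):
    """Euler steps of dW/dt = a*eta*S - b*eta*W with eta = W.S (linear f)."""
    rng = random.Random(seed)
    n = len(inputs[0])
    # Weak, symmetric initial weights (assumed): no input is favoured at the start.
    w = [0.01 * sum(s[i] for s in inputs) for i in range(n)]
    for _ in range(steps):
        s = rng.choice(inputs)          # inputs are presented in random order
        eta = dot(w, s)                 # neuron output
        w = [wi + dt * eta * (a * si - b * wi) for wi, si in zip(w, s)]
    return w

s1 = [1, 0, 1, 0]
s2 = [0, 1, 0, 1]
w = train([s1, s2])

# Projections of the final weight vector onto the two inputs.
proj1 = dot(w, s1) / dot(s1, s1)
proj2 = dot(w, s2) / dot(s2, s2)
```

In this sketch the two projections always end up summing to one. Typically one of them approaches one and the other zero, and which input wins depends only on the random presentation order.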

A traditional representation of an adaptive neuron and its holographic analog are
sketched in fig. 2a,b. During the recording, a reference beam, whose electric field amplitude
is $A_r$, interferes with n signal beams whose amplitudes are $(A_{s1}, \ldots, A_{sn})$. n
holograms with index modulations $\delta n_i$ are thus recorded in the crystal.


During each readout (the reference beam is blocked), all corresponding diffracted beams add
coherently and, assuming a small diffraction efficiency, the diffracted amplitude $A_d$ is
proportional to the weighted summation: $A_d \propto \sum_i \delta n_i\, A_{si} = \sum_i \delta n_i\, |A_{si}|$.
Thus, if $(A_{s1}, \ldots, A_{sn})$ are made proportional to the neuron input vector components
$(S_1, \ldots, S_n)$, $A_d$ will represent the neuron output $\eta$, where the index
modulations $\delta n_i$ are proportional to the components of the weight vector. In order to
implement an adaptive neuron, these index modulations must be adapted according to the value
of $\eta$ by using a feedback loop (see eq. 1).

For the geometry depicted in fig. 2b, n space-charge electric fields $E_i$ and the
corresponding holographic gratings $\delta n_i$ are recorded in the photorefractive material
[4]. Assuming a diffusion process, the kinetics of these $E_i$ is governed by the following n
equations:

$$\frac{dE_i}{dt} = \frac{I_0}{\tau_{0,i}}\left[\frac{2 A_{si} A_r^{*}}{I_0}\, E_{sc,i} - E_i\right] \qquad (3)$$

with

$$I_0 = A_r A_r^{*} + \sum_{j=1}^{n} A_{sj} A_{sj}^{*}$$

$E_{sc,i}$ is the maximum steady-state space-charge electric field that can be induced in the
material. $\tau_{0,i}$ is the photorefractive time constant for unit illumination. Both
parameters $E_{sc,i}$ and $\tau_{0,i}$ depend on the grating wave vector induced by the
interference between $A_{si}$ and $A_r$. Nevertheless, we assume that the whole set of signal
beams lies within an angle small enough for the variations of these parameters to be
negligible. Therefore, the effective electro-optic coefficients will not depend on the signal
beam orientation, i.e. on the index i, and equation (3) can be rewritten by changing $E_i$
into $\delta n_i$.

Similarly to computer simulations, we update the neuron weights by a discrete time method:

i) For each input vector S the diffracted amplitude (with the reference beam switched off)
provides the neuron output $\eta$.

ii) The weights are updated by reinforcing the gratings during a time period $\Delta t$ with
the reference beam switched on. Both the amplitude $\|A_r\|$ of the reference beam and the
time period $\Delta t$ are adjusted according to the neuron output and the desired adaptive
law. The changes $\Delta E_i$ are:

$$\Delta E_i = \frac{2 A_{si} A_r^{*}}{\tau_{0,i}}\, E_{sc,i}\, \Delta t - \frac{I_0}{\tau_{0,i}}\, E_i\, \Delta t \qquad (4)$$


From the comparison of equations (1) and (4) we derive the following adaptation rules:

a) Case p = q = 0: This trivial case is achieved by setting both $\|A_r\|$ and $\Delta t$ to
constant values, whatever the output result $\eta$.

b) Case p = 1, q = 1: This case is easily implemented by keeping $\|A_r\|$ constant while
making the recording time $\Delta t$ proportional to $\eta$. There is now a total
correspondence between the updating law and the kinetics of the photorefractive effect.

c) Case p = 2, q = 1: To implement this case, we adjust the different beam intensities so that
the reference beam intensity is much larger than the sum of the signal beam intensities. Now,
$\Delta t$ is a constant but $\|A_r\|$ is made proportional to $\eta$.
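
The way the exposure parameters map onto the adaptation law can be sketched in Python. The sketch assumes the usual diffusion-regime picture in which the recording term scales with the product of the signal and reference amplitudes and the erasure term with the total intensity; real amplitudes, unit material constants and the function name are assumptions made only for illustration.

```python
def grating_update(e_i, a_s, a_r, dt, e_sc=1.0, tau0=1.0, signal_intensity=0.0):
    """Change of one space-charge field E_i during an exposure of duration dt."""
    i0 = a_r * a_r + signal_intensity        # total illumination (real amplitudes assumed)
    growth = 2.0 * a_s * a_r * e_sc / tau0   # recording term, proportional to the input
    decay = i0 * e_i / tau0                  # erasure term, proportional to the weight
    return (growth - decay) * dt

# Case b): constant reference amplitude, exposure time proportional to the output.
short_exposure = grating_update(0.2, 0.5, 1.0, dt=0.1)
long_exposure = grating_update(0.2, 0.5, 1.0, dt=0.2)

# Case c): constant exposure time, reference amplitude proportional to the output
# (signal intensities neglected against the reference intensity).
record_weak = grating_update(0.0, 0.5, 1.0, dt=0.1)
record_strong = grating_update(0.0, 0.5, 2.0, dt=0.1)
erase_weak = grating_update(0.2, 0.0, 1.0, dt=0.1)
erase_strong = grating_update(0.2, 0.0, 2.0, dt=0.1)
```

In case b) both terms scale with the exposure time, so the whole update is proportional to the output. In case c) doubling the reference amplitude doubles the recording term but quadruples the erasure term.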


3.  Experimental  realization 

The scheme of the experimental implementation of a photorefractive basic neural unit is
presented in fig. 3 [2]. The initial plane wave from a CW Ar laser at 514 nm is split into the
reference (R) and signal (S) arms by the beam splitter BS. The signal beam is modulated by a
liquid crystal television screen, SLM (model XV100ZM from Sharp), and then split into a
multitude of spherical beamlets by two crossed arrays of cylindrical microlenses ML. The
image focal plane of these microlenses coincides with the object focal plane of lens L. The
photorefractive crystal (BaTiO$_3$) is set in the image focal plane of L. Thus each beamlet
from a microlens is transformed into a plane wave, so that all beams are superposed in the
crystal. The set-up is driven by a personal computer PC, which is used to display the input
vectors (images) on the SLM, to digitize the diffracted signal detected by the photodiode and
to control the optical shutter SH placed in the reference arm.

For the case p = q = 0, both the recording time period and the amplitude of the reference
beam are made constant and thus independent of the output $\eta$. Two orthogonal vectors
$S^1$ = (1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0) and $S^2$ = (0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1) are
randomly presented on the input SLM. As expected for this learning law, no selective learning
(selective adaptation) occurs during the experiment. We obtain a temporally averaged hologram
recording.

The recording time period is then made proportional to $\eta$ (i.e. to the diffracted
amplitude or to the diffracted intensity according to the chosen f) to implement the case
p = q = 1.


a) Competitive learning: The dynamic behavior for the same input vectors can be seen in fig. 4,
where the output of the neuron is presented versus the number of processing steps; f is linear.
There is very weak initial scattering (with approximately equal intensity) towards the
photodiode (neuron output) for both input vectors. After some period of initial processing the
randomly accumulated fluctuations break the symmetry and one of these vectors, $S^1$, becomes
the winner. With further processing the weight vector formation (and correspondingly the
output) of the neuron reaches its saturation level. Reading out the neuron weights with
separate vector elements gives the following normalized weight components:

W' = (0.7, 0.2, 0.8, 0.1, 0.7, 0.2, 0.8, 0.1, 1.0, 0.1, 0.9, 0.1, 0.8, 0.1, 0.7, 0.2)

The weight vector is thus oriented parallel to the input vector $S^1$. This winning vector is
not defined a priori, and the learning results in the winning of either vector with equal
probability if the same conditions are provided. That is, the scheme demonstrates
“flip-flop”-like behavior.

b) Feature extraction: Two orthogonal groups of inputs, each composed of two correlated
vectors, given by equation (2), are presented to the neuron in a random sequence. After 800
iterations have been performed, the weight components are read out. We get the following
distribution of components:

W = (0.53, 0.23, 0.18, 0.22, 0.22, 1.0, 0.93, 0.93, 0.56, 0.13)

The system has recognized and selectively adapted to the feature common to the two vectors
(0,0,0,0,0,1,1,1,1,0) and (1,0,0,0,0,1,1,1,0,0) (see the computer simulation results).


4.  Discussion 

The proposed implementation of a photorefractive neuron is able to perform quite interesting
data processing. We are currently working on the implementation of a self-organizing neural
network via the juxtaposition of a large number of identical neural units in different
locations of the same photorefractive crystal. This is a self-imaging optoelectronic feedback
system created by replacing the photodiode (see fig. 3) by a CCD camera (at the image plane of
the BaTiO$_3$) and the optical shutter by another spatial light modulator (which is imaged
onto the BaTiO$_3$). Very large capacities should thus be obtainable. Namely, the maximal
dimensionality of the input, n, can be estimated as the maximal possible number of angularly
multiplexed images. Optical photorefractive memories with as many as 5000 angularly multiplexed
images have already been experimentally demonstrated [5]. The diffraction limit for a crystal
of 1x1x0.5 cm$^3$ allows for the juxtaposition of more than 100x100 neural units. An
electronic lateral feedback (to control neural neighbourhood relations) and the fundamental
photorefractive kinetics presented above together provide the global network organization
algorithm (e.g., Kohonen map). This would permit the implementation of the higher-order neural
adaptation laws (cases q = p = 1 and q = 1, p = 2), which are usually strongly simplified in
computer simulations, thus limiting network learning capacities and requiring additional
external intervention [1]. This first experimental demonstration encourages us to look for
other nonlinear optical materials as well as for suitable holographic characteristics
(dynamics, dark memory, geometrical dependence, etc.) to implement different neural computing
algorithms.


Abstract. An attempt is made to exploit the inherently rich physics of volume
holographic interconnects. The effects of crosstalk and nonlinearities are
considered. When combined with a novel feedback configuration, complex,
self-organising behaviour is seen to emerge.

1.  Introduction. 

Volume holographic interconnects [1] represent one way of applying optics in connectionist
approaches to signal processing. They combine the usual advantages of optical interconnects
with high storage capacities (>10^ weighted connections cm$^{-3}$) and offer the
possibility of weight update during training [2].

However, when used in the 'conventional' way, several factors constrain their
performance. These include effects such as diffracted amplitudes being a nonlinear function
of grating strength; cross gratings and multiple grating interactions giving crosstalk; the
recording or modifying of gratings invariably changing the strengths of other gratings in the
hologram; and nonlinearities introduced by the recording process itself. All the foregoing can
limit the use of volume interconnects, even when the training process takes account of these
factors [3]. An alternative approach is to try to use these phenomena to good
advantage - in effect, to exploit more fully the rich physics of this dynamic, multiple grating
system. This may enhance the interconnect's capabilities and, indeed, may enable it to
perform useful information processing tasks in its own right. To investigate this idea, it is
necessary to study the regimes of behaviour of the interconnect in various configurations. In
the following, one such arrangement is described and investigated. Some conjectures on the
utility of the modified architecture are then briefly discussed.

2.  Architecture. 

A schematic of a modified architecture is shown in Figure 1. A single-mode fibre has been
added to the conventional volume interconnect arrangement. This fibre feeds light from one
of the outputs back to input 1, via a beamsplitter of reflectivity R. Superficially, this
arrangement has similarities to the ring resonator systems studied by Ikeda, Moloney,
Anderson, Firth and others [e.g. 4], but here the resonator path occupies a channel of a
dynamic, multiple grating interconnect, rather than passing through a nonlinear dielectric
slab. Additionally, the fibre is single-mode to remove transverse effects in the resonator.

An input pattern (e.g. an image or the output of some array of processing nodes in a
connectionist architecture) is presented to the system as a spatially coherent, complex
amplitude distribution. The wave from each element of this input array interferes with those
from other elements of the array to form a set of diffraction gratings in the dynamic
holographic material. The waves from the input array, in addition to being responsible for
the formation of the gratings, are diffracted by them, before they propagate to the output
array (perhaps for subsequent post-processing). For more details of the conventional
volume architecture and the grating formation see [2].

Figure 1. Schematic of the fibre feedback modification to the Fourier plane, volume
interconnect arrangement. Note the dynamic holographic material (e.g. a liquid crystal cell
or a photorefractive crystal) and the absence of a training array.

3.  Theory. 

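After some approximations [2], the governing equations describing the propagation of light
through the interconnect can be written as:

$$\frac{n}{c}\,\frac{\partial a_m}{\partial t} + \cos\theta_m\,\frac{\partial a_m}{\partial x} = \gamma \sum_{n \neq m} G_{mn}\, a_n \qquad (1)$$

$$\frac{\partial G_{mn}}{\partial t} = -A\, G_{mn} + B\, a_m a_n^{*} \qquad (2)$$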

where n is the bulk refractive index of the hologram material, c is the velocity of light,
$\theta_m$ is the angle of propagation of wave m, $a_m$ is its normalised amplitude, $G_{mn}$
is the mnth grating strength, and $\gamma$, A and B are material constants. Equations (1) are
coupled wave equations describing the diffraction of the waves as they propagate through the
hologram; the amplitude of each wave is coupled to the other diffracting waves' amplitudes
through the appropriate diffraction gratings. Note that cross gratings and multiple
diffraction paths are included. Equations (2) describe the grating formation in the hologram -
for the case shown, a photorefractive type characteristic is assumed. This type of response
can also approximate that of other materials, e.g. liquid crystals. The second term on the
right of (2) shows that the rate of growth of the mnth grating is related to the strength of
the appropriate interference pattern, whilst the first is a decay term. The boundary
conditions are: $a_m(x=0, t) = a_{m0}$ ($m \neq 1$) and
$a_1(x=0, t) = a_{10}(1-R) + R\,(1-L)\, a_j(x=d, t-\tau)\exp(i\phi)$, where $a_j$ is the
output wave that is fed back, L is the loss, $\tau$ is the time delay in the feedback loop and
$\phi$ is the phase shift associated with propagation through it.
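
The grating-formation behaviour described for equation (2), a growth term driven by the interference pattern and a decay term, can be illustrated with a short Python sketch. It assumes a single grating written by two waves of fixed complex amplitude, i.e. it ignores the coupling back onto the waves, and uses a plain Euler integration; the function and parameter names are hypothetical.

```python
def grow_grating(a_m, a_n, decay_a, gain_b, dt=1e-3, steps=20000):
    """Euler integration of dG/dt = -A*G + B*a_m*conj(a_n) for fixed wave amplitudes."""
    g = 0j
    drive = gain_b * a_m * a_n.conjugate()   # strength of the interference pattern
    for _ in range(steps):
        g += dt * (-decay_a * g + drive)
    return g

# Example with arbitrary illustrative values: the grating saturates at (B/A)*a_m*conj(a_n).
g_final = grow_grating(1 + 0j, 0.5j, decay_a=2.0, gain_b=1.0)
```

Under these assumptions the grating strength relaxes exponentially, at the decay rate A, towards the ratio of the growth and decay constants times the interference term.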

The time delayed feedback, combined with the nonlinearities and time dependent
nature of multiple gratings in the hologram, can endow the arrangement with very rich and
varied behaviour. In the following subsections, this is looked at more closely, starting with a
stability analysis - a standard procedure when studying nonlinear dynamical systems.

3.1.  Stability  analysis.

The size of the time delay $\tau$ can play an important role in the behaviour of the
arrangement, but a stability analysis is difficult due to the complexity of the system. Some
progress can be made, however, if it is assumed that the gratings do not vary significantly
through the depth of the hologram, i.e. $G_{mn}(0) = G_{mn}(x)$ for $0 \leq x \leq d$. The
problem then reduces to studying the properties of the infinite dimensional, delay
differential equation:

$$\frac{dG}{dt} = f[G(t), G(t-\tau)] \qquad (3)$$

where G is the matrix of the grating strengths. The stability can be determined by
perturbing about the system's steady state solution $G_{ss}$ and studying the way in which the
perturbations grow. This is equivalent to analysing the roots $\omega$ of

$$\det\left[J(G_{ss}, G_{ss}) + J_\tau(G_{ss}, G_{ss})\exp(-\omega\tau) - \omega I\right] = 0 \qquad (4)$$

where I is the identity matrix and the Jacobians are
$J_{ij} = \{\partial f_i[G(t), G(t-\tau)]/\partial G_j(t)\}|_{G(t)=G(t-\tau)=G_{ss}}$ and
$J_{\tau,ij} = \{\partial f_i[G(t), G(t-\tau)]/\partial G_j(t-\tau)\}|_{G(t)=G(t-\tau)=G_{ss}}$ [5].
Roots having real parts greater than zero correspond to unstable behaviour.

In general, the main results are, firstly, that the number of distinct roots (and
therefore the complexity) of the system increases as the time delay increases and, secondly,
that the system stability can change as the time delay is varied. These observations agree
qualitatively with Farmer's analysis of another delayed nonlinear system, the Mackey-Glass
equation [6]. In our case, the effects of delay start to become important at $\tau A > 0.01$. For
most photorefractives, time delays produced by fibres are therefore too short to significantly
alter the interconnect dynamics, but systems incorporating rotating photorefractives have
demonstrated suitably long, coherent delays, together with gain [7]. It may thus be possible
to vary the dynamics of the system by control of the delay time $\tau$.
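
The dependence of stability on delay can be illustrated with a much simpler system than the
one studied here. The sketch below is not the holographic model: it integrates the scalar toy
equation dx/dt = -x(t - tau), whose characteristic equation has the same exponential-in-delay
form and which is known to be stable only for tau < pi/2. The function names, step size and
run length are illustrative assumptions.

```python
def simulate_delay(tau, t_end=60.0, dt=0.01):
    """Euler integration of dx/dt = -x(t - tau), with x(t) = 1 for t <= 0.

    Toy linear delay equation (an assumption for illustration only),
    not the coupled-wave hologram model of the text.
    """
    lag = int(round(tau / dt))       # delay expressed in time steps
    x = [1.0] * (lag + 1)            # constant history on [-tau, 0]
    for _ in range(int(round(t_end / dt))):
        # x[-1] is x(t); x[-1 - lag] is the delayed value x(t - tau)
        x.append(x[-1] - dt * x[-1 - lag])
    return x


def late_amplitude(x, fraction=0.2):
    """Largest |x| over the final `fraction` of the trajectory."""
    tail = x[int(len(x) * (1.0 - fraction)):]
    return max(abs(v) for v in tail)


if __name__ == "__main__":
    for tau in (1.0, 2.0):
        print(tau, late_amplitude(simulate_delay(tau)))
```

With the shorter delay the oscillation dies away, while with the longer delay it grows, so
changing the delay alone moves this toy system across its stability boundary.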

Further  stability  analyses  in  terms  of  variation  of  other  system  parameters  have  also 
been  carried  out.  We  have  some  new  analytic  results  for  multiple  grating  systems  which 
allow  such  studies  to  be  made  without  making  the  assumption  that  the  gratings  do  not  vary 
with  hologram  depth.  Space  precludes  presentation  of  these  results  here,  but  system 
behaviour is found to vary strongly with, for example, the ratio of the complex photorefractive
material coefficients B to A.

3.2.  Bifurcation  behaviour. 

Several  bifurcation  analyses  have  been  undertaken,  illustrating  how  the  dynamics  of  the 
arrangement  evolve  with  variation  in  parameters.  Figure  2  shows  a  bifurcation  diagram 
resulting  from  the  variation  of  the  feedback  loss  L.  Suitable  adjustment  of  this  easily  varied 
parameter  should  therefore  provide  an  excellent  method  to  "tune"  the  system  dynamics. 
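
As a generic illustration of how a bifurcation diagram is assembled, the sketch below sweeps
one parameter of the logistic map and counts the distinct values visited after transients have
decayed. The logistic map is only a stand-in chosen because its period-doubling route to chaos
is well known; it is not the holographic feedback model, and the function name and settings are
assumptions.

```python
def attractor_size(r, transient=2000, samples=64, digits=6):
    """Count distinct long-run values of the logistic map x -> r*x*(1-x).

    1 means a fixed point, 2 or 4 a periodic cycle, and a large count
    suggests chaotic motion. Stand-in system, not the hologram model.
    """
    x = 0.5
    for _ in range(transient):       # discard the transient
        x = r * x * (1.0 - x)
    seen = set()
    for _ in range(samples):         # sample the attractor
        x = r * x * (1.0 - x)
        seen.add(round(x, digits))
    return len(seen)


if __name__ == "__main__":
    for r in (2.8, 3.2, 3.5, 3.9):
        print(r, attractor_size(r))
```

Plotting the sampled values against the swept parameter, rather than just counting them, gives
a diagram of the kind described for the feedback loss L.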

4.  Discussion. 

We  have  made  some  preliminary  experimental  observations  using  barium  titanate  crystals  in 
multiple  grating,  feedback  configurations.  Initial  results  do  suggest  that  complex  dynamics 
can  occur.  Such  modes  of  operation  are  being  increasingly  viewed  as  advantageous  in 
pattern  processing.  For  example,  the  ability  of  an  associative  memory  to  change  its 
dynamics,  during  a  search  task,  has  been  demonstrated  to  give  superior  classification 
performance  [8].  Additionally,  complex  dynamics  in  a  classifier  can  also  give  it  advantages 
in  noisy  environments.  The  volume  interconnect  system  described  here  also  has  direct 
equivalence to many features of the coupled map lattice paradigm [9]. Looked at in a
slightly different way, the iterative processing generated by the feedback endows the
system with properties not obtained with layered networks of comparable size.

Figure 2. Typical theoretical bifurcation diagram of system output power, as a function of fibre
loss parameter L. Fixed points (e.g. at L = 1, 0.55), limit cycles, period doubling routes to chaos,
and chaotic behaviour with 'gaps' where periodic motion returns, can all be seen.
N = 1, $a_{10} = a_{20}$, $\phi = \pi/2 = \angle(B/A)$, $|B/A| = 0.327$.

Self  organisation  in  pattern  processing  can  be  useful  as  a  preprocessing  stage  of  a 
pattern  recognition  system  (e.g.  as  in  Kohonen’s  self  organising  feature  maps  [10]).  The 
interconnect  described  here  also  has  its  own  self  organising  features.  In  particular,  it  is  able 
to  associate  input  patterns  with  different  attractors  at  its  output.  Inputs  that  differ  slightly 
(within  the  corresponding  basin  of  attraction)  can  give  rise  to  the  same  attractor.  In  this  way, 
the  system  performs  classification  and  clustering  without  external  supervision. 

5.  Conclusion. 

A  system  has  been  formulated  that  attempts  to  exploit,  more  fully,  the  complex  physics 
available in the volume interconnect. This includes the use of many processes conventionally
regarded as detrimental to interconnect behaviour. By suitable control of accessible
parameters  (e.g.  L),  it  may  be  possible  to  "tune"  the  behaviour  of  the  arrangement.  When 
combined  with  the  merits  of  the  conventional  volume  interconnect,  unusual,  and  possibly 
beneficial,  modes  of  behaviour  may  result. 

Abstract. We have developed a new holographic associative memory (HAM) which
can be implemented easily using learning patterns derived from an adaptive learning
rule. The principle of the learning pattern method (LPM) and simulation results for the
present HAM are presented.


1.  Introduction 

Although the optical implementation of an associative memory such as the Hopfield model can
be done easily with vector-matrix multiplications [1] or holographic systems [2], [3], the
performance may be limited by the practical difficulty of ensuring pseudo-orthogonality among
the memory patterns and by the existence of undesirable stable states. The LPM retains the high
performance of adaptive learning while using the simple outer-product learning implementation
of the HAM.


2.  Principle  of  LPM 


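Let $b^m$ be the m-th memory pattern to be stored; then the error function E is defined as

$$E = \sum_{m=1}^{M} \sum_{i,j}^{N} \left(o^m_{ij} - b^m_{ij}\right)^2 \qquad (1)$$

where $o^m_{ij}$ and $b^m_{ij}$ are the actual output state and its target state (i.e. the memory
pattern), respectively, and

$$o^m_{ij} = f\Big(\sum_{k,l} W_{ijkl}\, b^m_{kl}\Big). \qquad (2)$$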

We have adopted a gradient-descent algorithm to obtain the optimal interconnection weight W:

$$\Delta W_{ijkl} = -\eta\, \frac{\partial E}{\partial W_{ijkl}} \qquad (3)$$

where $\eta$ is the learning-rate constant. With an appropriate value of $\eta$, the above equation
yields an interconnection matrix with discrete, positive elements. After optimization, the
interconnection matrix W can be expressed as a linear combination of binary matrices as follows:

$$W = \sum_{s=1}^{S} \alpha_s \Gamma^s = \sum_{s=1}^{S} \alpha_s\, u^s (v^s)^T$$

where S is the number of learning pattern pairs, $\alpha_s$ is a constant coefficient and the
matrix $\Gamma^s$ is the outer product between the learning patterns $u^s$ and $v^s$, which have
binary elements (1, 0).
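
The composition of W from learning patterns can be sketched numerically. The example below
builds a small matrix as a weighted sum of outer products of binary vectors. The three-element
patterns and the coefficients are made up for illustration; they are not the learning patterns
of this paper, and the function name is an assumption.

```python
def weighted_outer_sum(alphas, us, vs):
    """Return W = sum_s alpha_s * u^s (v^s)^T as a list of rows.

    alphas: coefficients, us / vs: lists of binary (0/1) vectors.
    All inputs here are illustrative, not taken from the paper.
    """
    n_rows, n_cols = len(us[0]), len(vs[0])
    w = [[0] * n_cols for _ in range(n_rows)]
    for alpha, u, v in zip(alphas, us, vs):
        for i in range(n_rows):
            for j in range(n_cols):
                w[i][j] += alpha * u[i] * v[j]   # one outer-product term
    return w


if __name__ == "__main__":
    w = weighted_outer_sum([2, 1],
                           [[1, 0, 1], [0, 1, 1]],
                           [[1, 1, 0], [0, 1, 1]])
    for row in w:
        print(row)
```

Because every term is a non-negative multiple of a binary outer product, the resulting matrix
has non-negative integer elements, which is the property the holographic recording relies on.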


3.  Holographic  Implementation  of  LPM 

The three two-dimensional binary patterns L, P, and M are selected as the memory patterns, as
shown in Fig. 1, and they can be written in the form of 25-bit (5x5) vectors as follows.

(L): $b^1$ = (1000010000100001000011111),
(P): $b^2$ = (1111010001111101000010000),
(M): $b^3$ = (1000111011101011000110001).

Fig. 1. Three two-dimensional memory patterns.


2  1  I  1  0  2  S  0  0  I  2  I  I 
0333000000030 
0333000000030 
0333000000030 
0000303030000 
2  I  I  I  0  2  0  0  0  1  2  1  1 
0000303030000 
OOOOOOOOOOOOO 
0000303030000 
0  2  2  2  2  0  2  0  2  4  0  2  4 
2  111  0  2  0  0  0  1  2  1  1 
0333000000030 
0222202024024 
0  3  3  3  0  0  0  0  0  0  0  3  0 
0000303030000 
2  11  1  0  2  0  0  0  1  2  11 
O0O0OOOOO0OOO 
OOOOOOOOO00O0 
0000000000000 
0000303030000 
2  1  110  2  0  0  0  1  2  11 
0000000000000 
OOOOOOOOOOOOO 
0000000000000 
0000202020000 


102000021111 
300000000000 
3  0  0  0  0  0  0  0  0  0  0  0 
300000000000 
030000300000 
1  0  2  0  0  0  0  2  1  1  1  1 
030000300000 
OOOOOOOOOOOO 
030000300000 
220000200000 
102000021111 
3  0  0  0  0  0  0  0  0  0  0  0 
220000200000 
300000000000 
030000300000 
102000021111 
OOOOOOOOOOOO 
000000000000 
OOOOOOOOOOOO 
030000300000 
1  0  2  0  0  0  0  2  11  1  1 
000000005550 
000000005550 
000000005550 
020000203335 


Fig.  2.  The  interconnection  matrix  W 
obtained  from  the  adaptive  learning  rule. 

Figure 2 illustrates the 25x25 interconnection matrix obtained by using the learning algorithm in
Sec. 2. After simple algebra, we can derive ten learning pattern pairs ($v^s$, $u^s$) and their
coefficients $\alpha_s$ from the matrix W (Fig. 3). As shown in Fig. 4, the interference pattern
between the collimated beam passing through the pattern $v^s$ and the beam scattered from the
pattern $u^s$ and the ground glass holographically constructs the outer product between $v^s$ and
$u^s$ [3], [4]. The coefficients $\alpha_s$ control the exposure times. The experimental setup for
the retrieval of stored information is shown in Fig. 5.


Fig. 3. Ten learning pattern pairs ($v^s$, $u^s$) and their coefficients $\alpha_s$ (s = 1, 2, ..., 10).


Fig.  4.  Schematic  diagram  of  recording  the  outer- 
product  of  learning  patterns  holographically. 


Fig.  5.  Setup  of  the  retrieval  of  the  memory  pattern  for  the  input  pattern. 

The energy function U is defined as follows:

^  ij  kl 

where $N_0$ is the number of 1's in the input and X is a constant. Since the elements of the matrix
are positive, an input-dependent threshold level should be used. It is important to use an
appropriate value of X, because the stable states of the energy and the input-dependent threshold
level depend on it. Figure 6 shows the simulation results of the LPM for various values of X. The
recognition probability is defined as the number of correct recognitions divided by the total
number of inputs. The simulations were repeated over 300 randomly generated inputs at each
Hamming distance (0 to 15), and the results are averaged over the three memory vectors L, P, and
M. The best results were obtained in the range 0.6 < X < 0.7. In conclusion, all memory patterns
can be stored as stable states, and the recognition capability was improved considerably by using
the learning patterns instead of using the memory patterns directly.
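
The evaluation procedure described above can be sketched as follows. The code generates test
inputs at a fixed Hamming distance from a memory vector and estimates a recognition
probability. The `recall` argument is a hypothetical placeholder for whatever memory model is
being tested; the LPM retrieval itself is not reproduced here, and the helper names and seed
are assumptions.

```python
import random

# The 25-bit memory vector for the pattern L, as listed in the text.
L_PATTERN = [int(c) for c in "1000010000100001000011111"]


def flip_bits(pattern, distance, rng):
    """Return a copy of `pattern` with exactly `distance` bits inverted."""
    noisy = list(pattern)
    for p in rng.sample(range(len(pattern)), distance):
        noisy[p] = 1 - noisy[p]
    return noisy


def hamming(a, b):
    """Number of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def recognition_probability(recall, pattern, distance, trials=300, seed=0):
    """Fraction of noisy inputs for which `recall` returns `pattern`.

    `recall` is a placeholder callable (list -> list) standing in for
    the associative memory under test.
    """
    rng = random.Random(seed)
    hits = sum(recall(flip_bits(pattern, distance, rng)) == pattern
               for _ in range(trials))
    return hits / trials
```

Repeating the estimate for each distance from 0 to 15 and for each stored vector, then
averaging, gives a curve of the kind plotted in Fig. 6.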


Fig.  6.  Recognition  probability  versus  Hamming  distance  of  input 
patterns  with  various  constant  scale  factors  X. 

Abstract

An optical associative memory with bipolar edge-enhanced feature learning, using a
ferroelectric liquid crystal spatial light modulator and a barium titanate crystal, is
presented. A high discrimination ability is achieved. Experimental results and
computer simulations are given.


1.  Introduction 

In an associative memory, the recalled output is the weighted summation over all the stored
patterns. The weight is determined by the similarity of the input stimulus and the stored
pattern. To achieve a better association, we increase the wanted weight and suppress the
unwanted ones. We can find the same objective in optical correlation pattern recognition,
where we seek better discrimination abilities for a correlator. Therefore techniques used in
pattern recognition are applicable to the optical implementation of associative memories.
The phase-only filter (POF) is known for its optimal discrimination. Investigations revealed
that the magnitude squared of the impulse response of the optical correlation system using a
POF is the edge outline of the original reference, and that the histogram of the impulse
response consists of both positive and negative values; the positive part lies predominantly
inside the outline while the negative part lies predominantly outside the outline. Therefore
we can produce ternary-valued edge-enhanced versions of the patterns to be stored by the
associative memory. In the edge-enhanced version of a pattern, pixels adjacent to the outline
of the object are assigned the value 1 if they lie inside the object and -1 if they lie outside
it, while the other pixels are given the value 0. The bipolar edge-enhanced pattern has
equally distributed positive and negative pixels, which meets the equal-distribution
requirement for an associative memory.

We set up an optical associative memory which uses a barium titanate photorefractive crystal
(PRC) as the memory recording medium, a ferroelectric spatial light modulator (SLM) as the
programmable input device and a charge-coupled device (CCD) camera as the output detector.
The use of a PRC enables the interconnect hologram to be recorded in real time, which also
introduces reconfigurability into the system. The use of the SLM permits a programmable
system.


2.  Holographic  associative  memory 


Suppose that there are M patterns $f_i$, i = 1, 2, ..., M, with binary values (0, 1), to be
stored in a holographic associative memory. There exists an input-output pair of patterns,
$f'_i$ and $f''_i$, used in hologram recording for the storage of pattern $f_i$. When the learning
process is finished and the associative recall is in progress with a stimulus, s, being input
into the system, the expected weighted output, E, satisfies

$$E = \sum_{i=1}^{M} (s \cdot f'_i)\, f''_i \qquad (1)$$

where $(s \cdot f'_i)$, known as the weight of $f''_i$ in the output, represents the inner product
of patterns s and $f'_i$. A threshold on the expected output gives the actual output of the
system, expressed by

$$O = T\Big[\sum_{i=1}^{M} (s \cdot f'_i)\, f''_i\Big] \qquad (2)$$

where T[ ] denotes a sigmoid-like threshold function. If an original pattern is correctly
memorized, the thresholded output must be that pattern when it is input into the system. To
achieve the best performance, the unthresholded output should be as similar to the correct
stored pattern as possible. One method is to design the patterns $f'_i$ and $f''_i$, by
orthogonalization preprocessing for example [1], so that when one of the stored patterns is
used as the stimulus the unthresholded output is the stored pattern itself. Another method is
to leave $f''_i$ as the original pattern and to design only $f'_i$, so that the weight for the
correct stored memory is much higher than all the other weights. The bipolar edge-enhanced
learning belongs to the latter.
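
The weighted recall described in this section can be sketched in a few lines. The example
assumes a hard threshold in place of the sigmoid-like function T, and uses made-up
four-element patterns with bipolar keys standing in for the designed inputs; none of the
values come from the experiment, and the function names are assumptions.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def recall(stimulus, keys, values, threshold):
    """Thresholded weighted sum over stored input-output pairs.

    keys   : designed input patterns (the f' of the text)
    values : output patterns (the f'' of the text)
    A hard threshold replaces the sigmoid-like T[ ] (a simplification).
    """
    weights = [dot(stimulus, k) for k in keys]
    summed = [sum(w * v[p] for w, v in zip(weights, values))
              for p in range(len(values[0]))]
    return [1 if e > threshold else 0 for e in summed]


if __name__ == "__main__":
    values = [[1, 1, 0, 0], [0, 0, 1, 1]]        # illustrative originals
    keys = [[1, 1, -1, -1], [-1, -1, 1, 1]]      # illustrative bipolar keys
    print(recall([1, 0, 0, 0], keys, values, 0))
```

In this toy case a stimulus containing only part of the first pattern still produces a larger
weight for that pattern, so the thresholded output restores it in full.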


3. Learning principle of the optical system

The optical system performing the bipolar edge-enhanced feature learning is shown
schematically in Fig. 1. A laser beam is enlarged and focused onto the SLM by lenses L1 and
L2. A computer-generated hologram (CGH), which fans out the incident beam into 33 x 33
equal-intensity beams, is used to pixelate the patterns displayed on the SLM and to ensure
that the high contrast ratio of the SLM is not compromised by scattering from the interpixel
dead space. The other purpose of using a CGH is to introduce a random phase function. The
original pattern and its bipolar edge-enhanced version are simultaneously displayed on a
binary SLM, with the edge-enhanced version and the original corresponding to the input and
the output of the input-output pair, respectively.

Figure 1: Optical system for the learning; HP1 and HP2 are half-wave plates, L1 - L5 are
lenses, M1 - M4 represent mirrors, and P1, P2, and P3 denote polarizers.

The ternary modulation, as illustrated in Fig. 2, is realized in the following way. Two
pictures, labelled B and C, respectively, are simultaneously displayed on the SLM. The
fan-out beams of the CGH illuminate B. Beams passing through B are reflected by mirror M2
onto pattern C. The system is arranged so that B is imaged pixel-wise in coincidence onto C.
Polarizers P1, P2 and P3 are placed so that light passing through B (and A) is binary
amplitude modulated, while light passing back through C is binary phase modulated.

Figure 2: Ternary modulation of a binary SLM; a number in a box denotes the value written on
the SLM, and a number in a circle is the value represented by the light field.

For each of the to-be-stored patterns, an original pattern, $f_i(x,y)$, is displayed in binary
amplitude form on the SLM at position A, while two patterns are placed at positions B and C,
respectively, to form a ternary edge-enhanced version of that pattern, $f'_i(x,y)$. Light
passing through A goes via the upper path of the system, with the CGH and pattern A being
imaged onto the PRC and the CCD, respectively, by lens L5. The field behind the SLM is
therefore expressed by

$$U_i(x,y) = f_i(x,y)\,H(x,y), \qquad (3)$$

where (x, y) denotes the coordinate system of the SLM plane,

$$H(x,y) = \sum_{m=-16}^{16}\sum_{n=-16}^{16} \exp[i\phi(m,n)]\,\delta(x-md,\,y-nd) \qquad (4)$$

denotes the diffraction of the CGH, d is the period of the SLM pixels, and constant
coefficients have been dropped for convenience. It can be shown that the light distribution
on the PRC is


■u'(a,/?)  =  exp 


ik  ,  2 


^(k{d,+d2)^kid,^do)^ 

\  ^1^3  dids 


di  -k  d')  di  +  d2 


c?3 


(5) 


where $(\alpha,\beta)$ is the coordinate system in the PRC plane, $F_i(p,q)$ is the Fourier
transform of $f_i(x,y)$, which satisfies

$$F_i(p,q) = \iint f_i(x,y)\exp[-i(px+qy)]\,dx\,dy, \qquad (6)$$

$h(u,v)$ is the transmittance of the CGH, that is

$$H(x,y) = \iint h(u,v)\exp\Big[-\frac{ik}{d_1}(xu+yv)\Big]\,du\,dv, \qquad (7)$$

* denotes convolution, and k is the wavenumber of the light source.

Light passing through both B and C is directed by the lower path of the system, with C being
imaged onto the PRC. The light distribution on the crystal for this path is written as


ei{(xj3)  =  exp 


ik{dr,-kdQ)  2  , 
- ,0 . -(q  -t-  /i  ) 


dr, 


ds  dr, 

The light from both paths overlaps and interferes there. The photorefractive effect creates a
volume phase hologram inside the PRC. The diffraction term is expressed as

$$t_i(\alpha,\beta) = e_i^{+}(\alpha,\beta)\,u'_i(\alpha,\beta), \qquad (9)$$

where + in superscript means complex conjugation. Here we ignore the crystal thickness and the
angle, $\theta$, made by the two paths for simplicity. This simplification does not affect the
result for the associative recall [1].

Patterns are memorized by being input into the SLM in turn [1, 2]. When all the learning has
been completed, we can modify the above equation to obtain the final diffraction expression

$$t(\alpha,\beta) = \sum_{i=1}^{M} e_i^{+}(\alpha,\beta)\,u'_i(\alpha,\beta). \qquad (10)$$

Further learning is permitted whenever it is needed. But the learning of a new memory will
cause the previously stored memories to be gradually forgotten. We have to balance this
tradeoff to get an optimal learning procedure, that is, to adjust the photorecording time for
each of the memories according to its position in the learning sequence.


4.  Associative  recall 


Associative recall is carried out in the system with the upper path blocked. Besides feeding
an edge-enhanced pattern, we can also use an original pattern or its incomplete version as
the input stimulus. Feeding an input stimulus, s(x,y), into the SLM at place B or C, we have
the light distribution just in front of the crystal as


i{a,  I3)  —  exp 


After passing through the crystal and being diffracted by the recorded volume hologram, the
light field just behind the crystal is given by

$$s'(\alpha,\beta) = s(\alpha,\beta)\,t(\alpha,\beta) \qquad (12)$$

The intensity distribution on the output plane, i.e. the CCD plane, is simplified and given by


■  f'i)  fi 


2 

.  (13) 


where 


dif/3 

-|-  d2)dA 


(14) 


This is similar to Eq. (1) except for the modulus squared. The intensity distribution at the
output plane is the weighted summation of all the stored patterns. The weights are determined
by the inner products of the input stimulus and the corresponding edge-enhanced patterns. If
s is an original pattern, $f_k$, or a portion of it, the inner product of $f_k$ and $f'_k$ is
much higher than those between $f_k$ and the other memories. Hence high discrimination
amongst the stored patterns is achieved.

A  threshold  on  results  in 

=  (15) 

If the input stimulus is an incomplete version of a memorized pattern, this thresholded
output may not be an exact replica of the corresponding memory. A feedback process is needed
to readdress the SLM with this thresholded output as the new input stimulus. The feedback
enables the system to update its output, with each loop yielding an output more similar to
the stored pattern, until a stable output is achieved.

5. Experimental demonstration and computer simulation

5.1  Optical  demonstration 

Five 13 x 13-pixel Chinese characters, four as memory patterns and one as a false input
stimulus, were chosen for the optical demonstration. The edge enhancement was carried out in
two steps. In the first step, the 0-valued pixels adjacent to a 1-valued pixel were set to -1
while the others were kept unchanged; in the second step, the 1-valued pixels with no
-1-valued near neighbours were set to 0 while the others were unchanged.
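
The two-step edge enhancement can be sketched as below. The text does not say which pixels
count as adjacent, so the sketch assumes the four horizontal and vertical neighbours; an
eight-neighbour rule would need only a longer offset list. The function names are
illustrative.

```python
def neighbours(img, r, c):
    """Yield the values of the 4-connected neighbours of pixel (r, c).

    The 4-neighbourhood is an assumption; the text only says 'adjacent'.
    """
    rows, cols = len(img), len(img[0])
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            yield img[rr][cc]


def edge_enhance(img):
    """Return the ternary (-1, 0, 1) edge-enhanced version of a binary image."""
    rows, cols = len(img), len(img[0])
    # Step 1: background pixels touching the object become -1.
    step1 = [[-1 if img[r][c] == 0 and 1 in neighbours(img, r, c) else img[r][c]
              for c in range(cols)] for r in range(rows)]
    # Step 2: object pixels with no -1 neighbour (the interior) become 0.
    return [[0 if step1[r][c] == 1 and -1 not in neighbours(step1, r, c)
             else step1[r][c]
             for c in range(cols)] for r in range(rows)]


if __name__ == "__main__":
    square = [[0, 0, 0, 0, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 1, 1, 1, 0],
              [0, 0, 0, 0, 0]]
    for row in edge_enhance(square):
        print(row)
```

For a filled square this leaves a ring of 1s just inside the boundary, a ring of -1s just
outside it, and zeros elsewhere.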

We calculated the weights for both the original bipolar Hopfield model (BH) [2, 3], where
both the input stimulus, s, and the original pattern, $f_i$, take bipolar form, and our
bipolar edge-enhanced model (EE). An average improvement of discrimination from 73% (BH) to
80% (EE) was achieved. Discrimination is defined as the average ratio of the difference
between the weights for the expected memory and an unexpected memory to the weight of the
expected memory. This system is equivalent to a high-order nonlinear one with an order of 1.3.
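
The discrimination measure defined above is simple to compute once the weights are known. In
the sketch below the weights are invented numbers used only to exercise the formula; they are
not measured values from the experiment, and the function name is an assumption.

```python
def discrimination(weights, expected):
    """Average of (w_expected - w_other) / w_expected over all other memories.

    weights  : inner-product weights, one per stored memory
    expected : index of the memory that should be recalled
    """
    w_exp = weights[expected]
    others = [w for i, w in enumerate(weights) if i != expected]
    return sum((w_exp - w) / w_exp for w in others) / len(others)


if __name__ == "__main__":
    # Illustrative weights only: expected memory first, three competitors after.
    print(discrimination([10, 2, 3, 1], 0))
```

A value close to 1 means the competing weights are negligible beside the expected one, while a
value near 0 means the memories are barely distinguished.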

Figure 3(a) shows the recalled output where a stored pattern is used as the recall stimulus.
The output threshold level is determined by the input energy. The output for a nonstored
stimulus is shown in Fig. 3(b). There is a dramatic drop in the output energy, normalized to
the input energy, when a nonstored pattern is input into the system. This implies that the
system discriminates strongly between stored and nonstored patterns. It also suggests that
this phenomenon can be used as a measure to decide whether the input is a memory of the
system.

Figure 3: Recalled output when the input stimulus is (a) a stored pattern or (b) a non-stored
pattern.

Partial information addressability was also investigated, where the partial information was
the right half of the original pattern. The results indicate that the system can perform the
partial association, with only a slight decrease in the output signal-to-noise ratio. When
the input stimuli are the edge-enhanced versions, the output is better than that in the first
two cases. The edge-enhanced versions of the patterns can be treated as phase-coded
addresses, so the system is addressable by both address and content information.

5.2 Computer simulation

We also performed a computer simulation to evaluate the retrieval error of the proposed
learning algorithm. Most performance evaluations of neural networks are carried out
statistically by computer simulation in which the stored patterns are randomly generated. But
a random pattern might be meaningless in practical applications, where the patterns involved
are obtained from the real world. So the storage capacity in practice is lower than that
found in simulations based on random patterns.

To do the statistical evaluation of the retrieval error of the proposed learning algorithm,
we chose the frequently used Chinese character set, which consists of 3755 characters in
16 x 16 bitmap format. We simulated the relationship between the retrieval error and the
number of stored patterns, as shown in Fig. 4, averaged over 200 independent simulations in
which the stored patterns were selected randomly from the 3755-character library. We
calculated the cases for the Hopfield model (HM), its bipolar counterpart, i.e. the BH, and
EE. Computer simulations show that the retrieval error for EE is much smaller than those of
HM and BH. When the number of stored patterns is 9, the errors for the three are 33.5% (HM),
17.8% (BH) and 6.99% (EE), respectively. We also proposed a modified EE algorithm (ME), in
which the edge enhancement is carried out so that a 1-valued pixel is reset to the number of
its 0-valued nearest-neighbour pixels, and a 0-valued pixel is reset to the negative of the
number of its 1-valued nearest-neighbour pixels. With the same simulation as above, we find
that the corresponding retrieval error is now 0.732%.

Figure 4: Retrieval error versus number of stored patterns in computer simulation.

The work of Dr. Xu-Ming Wang was supported by K. C. Wong Education Foundation on a Royal
Society K. C. Wong Fellowship. The work of Mr. Jian Wang is supported by a K. C. Wong
Scholarship.

INTRODUCTION

Associative memories were the first optical neural network architecture to be
implemented^’^. Previous optical implementations have some drawbacks, such as the need
for an interruption of the light beam path between the different layers of the neural
network, due mainly to the lack of suitable optical thresholding units. As a consequence,
most of the implementations so far have used detectors coupled with light sources'^. This
solution, however, may cause data bottlenecks while requiring additional components.

Paek and Psaltis have also proposed an all-optical associative memory based on a
double correlator architecture^. However, it lacked translation invariance due to the use of
pinholes to suppress sidelobes of the correlation peaks. It also required a powerful laser
and expensive optical components, i.e., an image intensifier and a liquid crystal light
valve, which makes this approach impractical.

A feed-forward associative memory implementation based on the double correlator
architecture is proposed. In order to allow translation invariance, all the pixels of the input
plane have to be analyzed. This may be accomplished by means of an optoelectronic
thresholder^. This module also allows an uninterrupted propagation path, so only one
HeNe laser source with a relatively low power of 35 mW is required to operate the
associative memory. The holograms required to perform both the correlation and the
association are computer-generated holograms with a relatively small number of degrees
of freedom. In order to obtain good image quality and sufficient diffraction efficiency, the
associative layer's CGHs are computed using the global iterative coding method^.

ASSOCIATIVE MEMORY ARCHITECTURE

This associative memory is based on the double correlator architecture. Our
implementation is shown in Fig. 1. An input image is presented to the first correlator. The
choice of filter is crucial as it directly determines the signal that will be transmitted to the
second layer of the neural net. The second layer simply performs the recall of the
information stored in the second memory. The first filter requires good discrimination
and a sharp correlation peak to approximate a delta function. The phase-only filter
satisfies those two requirements and is used in the first layer of the neural network.


FIGURE  1.  Optical  memory  architecture. 


An optoelectronic module was inserted between the two layers to perform the
thresholding operation. This module taps the output of the first layer and modulates the
output of an electrically addressed spatial light modulator in order to attenuate small
signal intensities while keeping higher signal values unchanged. This kind of thresholder
allows the light path to be uninterrupted, so that the photons arriving at the system output
come directly from the input laser without passing through any electronic relay. Part of the
photons are detected by the tapping section of the threshold module^.

It should be noted that the threshold is applied to each point of the output plane of
the first layer over a spatial extent given by the resolution of the modulator. Such an
operation makes it possible to track the input object even as its location changes. The
thresholder maintains the translation invariance property exhibited by the first correlator.
The thresholder not only eliminates the cross correlations but also helps to sharpen the
detection peak by attenuating the sidelobes with intensities lower than the threshold value.
The delta function is more closely approximated after the thresholding operation.
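
As a rough numerical illustration of the point-wise thresholding described above, the following Python sketch attenuates samples below a threshold and leaves stronger samples unchanged. The function names, the attenuation factor and the sample values are assumptions chosen for illustration; they are not taken from the experiment.

```python
def soft_threshold(plane, threshold, attenuation=0.05):
    """Attenuate intensities below `threshold`; pass the others unchanged.

    `plane` is a list of non-negative intensities sampled at the modulator
    resolution. `attenuation` is a hypothetical transmission factor for
    weak signals (an ideal thresholder would use 0).
    """
    return [v if v >= threshold else v * attenuation for v in plane]


def shift(plane, k):
    """Circularly shift a 1-D plane by k samples (models a translated input)."""
    k %= len(plane)
    return plane[-k:] + plane[:-k]


# Hypothetical cut through the correlation plane:
# one strong peak, two sidelobes and a weaker cross-correlation.
correlation = [0.02, 0.10, 1.00, 0.12, 0.03, 0.30, 0.01, 0.00]

out = soft_threshold(correlation, threshold=0.5)

# The operation is applied point by point, so thresholding a shifted plane
# gives the shifted version of the thresholded plane.
shifted_out = soft_threshold(shift(correlation, 3), threshold=0.5)
```

Because the rule acts on each sample independently, the translation invariance of the first correlator is preserved in this simplified model.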

After this process, the recall is carried out by the second layer. In order to obtain a
good reconstruction quality, a computer-generated hologram (CGH) is calculated with the
help of the global iterative coding (GIC). This coding allows fast computation of the CGH
and a good reconstruction quality. Furthermore, with the GIC, more than eight times the
energy detected with conventional coding is detected in the first diffraction order.


EXPERIMENTAL  RESULTS 

An  optical  associative  memory  corresponding  to  the  set-up  of  Fig.  1  was  built.  The 
filters  were  recorded  on  high-resolution  photographic  film  with  the  help  of  a  laser  writer. 
The input images were binary images of 256×256 pixels representing standard security
symbols. The first filter had dimensions of 256×256 pixels; the second filter had only
128×128 pixels. This small number of degrees of freedom required the use of a coding
scheme with good performance on small supports, such as the GIC. A single 35 mW He-Ne
laser was used to perform the experiments.


FIGURE 2. Input occluded by 50% (left) and corresponding output (right).

The system behavior when a degraded object is presented at the input is illustrated in
Fig. 2. The objects to be detected were the two biological hazard symbols, whereas the
fire and wheat symbols had to be discarded. In this experiment, 50% of the input image
was  occluded.  The  system  recognized  the  object,  and  the  association  was  correct.  The 
amount  of  degradation  allowed  by  the  system  strictly  depends  on  the  choice  of  the  filter  in 
the  first  layer.  A  filter  showing  good  discrimination  capabilities  will  allow  the  neural 
network  to  be  robust.  Note  that  only  one  cycle  had  to  be  performed  to  obtain  the  output. 


CONCLUSION 

A  continuous  optical  associative  memory  was  implemented.  Experimental  results 
show  a  good  behavior  of  the  system  in  the  presence  of  unwanted  objects  and  of  degraded 
targets. The neural network can be the basis of a system that cleans an input scene of
unwanted noise or objects while reconstructing the desired occluded objects. The
architecture is versatile because it can easily be modified by changing the filters in the
first and second layers of the neural network.

Abstract. Two phase conjugate mirrors, used during recording, improve the diffraction
efficiencies  of  angularly  multiplexed  holographic  gratings.  These  were  used  to  demonstrate  a 
phase  conjugate  resonator  with  15  modes  using  non-degenerate  multi-beam  induced  phase 
conjugation. 


1.  Introduction 

There  has  been  considerable  interest  in  the  use  of  resonators  incorporating  photorefractive 
holographic  memories  with  phase  conjugate  mirrors  to  give  optical  feedback  for  use  in  optical 
associative  memories [1-3].  Difficulty  has  been  experienced  in  realising  such  resonators  in 
practice  due  to  the  poor  diffraction  efficiencies  of  the  holograms.  In  this  paper  we  propose  and 
demonstrate  a  novel  technique  in  which  two  phase  conjugate  mirrors  are  used  during  the 
recording  to  store  holograms  with  high  diffraction  efficiencies  suitable  for  use  in  a  resonator. 
We  go  on  to  demonstrate  such  a  phase  conjugate  resonator  with  15  modes. 

Consider the recording of a single grating in a thick (~0.7 cm) photorefractive crystal.
During recording, the refractive index modulation, $\Delta n$, is determined by [4]

$$\Delta n = \Delta n_{sat}\,\left[1 - \exp(-t/\tau)\right],$$

where $\Delta n_{sat}$ is the saturation value of the refractive index modulation, and $\tau$ is the recording time
constant.  The  refractive  index  modulation  magnitude  decays  exponentially  with  depth  into  the 
crystal  since  the  writing  beams  are  absorbed  progressively  with  depth  of  penetration.  In  our 
technique we generate two phase conjugate beams from the remnants of the two writing beams
that pass completely through the crystal. The beams are generated using two
phase  conjugate  crystals.  One  is  in  the  "Cat"[5]  configuration  and  phase  conjugates  the  object 
beam.  The  other  crystal  is  "induced" [6,7]  to  enable  it  to  phase  conjugate  complex  reference 
beams  which  may  have  components  propagating  over  a  wide  range  of  angles.  In  this  way  both 
the  reference  beam  and  object  beam  (no  matter  how  complex)  are  returned  to  retrace  their 
original  paths  but  in  the  opposite  direction  with  the  same  phase  as  they  had  on  their  incoming 
journeys.  So  the  returning  beams  add  constructively  to  the  original  writing  beams.  In  addition, 
the  grating  that  the  phase  conjugate  beams  create  is  exactly  in  phase  with  the  previous  grating, 
and  therefore  the  latter  gets  enhanced  and  sustained.  However,  the  amplitude  of  the  returning 
conjugate  beams  decays  exponentially  from  the  back  to  the  front  of  the  crystal  in  the  opposite 
sense to the initial writing beams. This offsets, to some extent, the original non-uniform depth
profile of refractive index modulation, resulting in better utilisation of the crystal volume and a
higher diffraction efficiency.

Fig. 1 Schematic showing experimental layout for high efficiency photorefractive storage.

2.  Experiment  and  Results 

In our experiments (Fig. 1) we used an ADLAS 400 mW diode-pumped, frequency-doubled
Nd:YAG laser. Both reference and object beams were plane and collimated. The total exposure
intensity was 56 mW/cm² on the crystal surface, and both beams had equal intensity and
diameter. The region illuminated was a circle of diameter 1.5 mm. Each beam was
angled at θ = 45° to the normal and was extraordinarily polarised (in the plane of the figure).
The photorefractive crystal, grown by the Shanghai Institute of Ceramics at the Chinese Academy
of Science, was Y-cut lithium niobate with 0.05% by weight of iron doping, resulting in a 34±0.2%
transmissivity. Lenses were used to focus the wide beams emerging from the iron-doped
lithium niobate into the small barium titanate crystals (5 mm cubes) used as the phase conjugate
mirrors. The reflectivities of the self-pumped crystals with an incident intensity of 28 mW/cm²
were 30% and 35%. The first crystal was grown in China and supplied by Photox Optical
Systems, UK, and the second by Hughes in the USA. The diffraction efficiency was measured
during the recording by using a red probe beam from a He-Ne laser (λp = 632.8 nm).

The  experimental  results  (Fig.  2)  show  a  plot  of  the  diffraction  efficiency  as  a  function  of 
time.  Curve  1  was  obtained  experimentally  without  phase  conjugate  mirrors  (PCMs)  while 
curve  2  was  obtained  with  phase  conjugate  mirrors.  Both  curves  have  similar  forms  showing 
initial growth and relaxation back to a lower, more stable value with time. The curves grow at
the same rate for the first 15 seconds; thereafter, the curve using PCMs grows at a much
faster rate than that without PCMs and attains a higher efficiency of 22.1% after 220 seconds,
while that without PCMs has a maximum of 17.9% after 179 seconds. This represents an
improvement in diffraction efficiency of 4.2 percentage points. Both curves tend to a diffraction
efficiency of about 16.5% at longer exposure times of 9 minutes, so to achieve the highest efficiencies the
exposure  time  must  be  set  correctly.  The  curves  do  not  appear  to  fit  well  to  a  model  based  on 
the  usual  inverse  exponential  saturation,  but  simply  for  the  sake  of  comparison,  we  have 
calculated  the  writing  time  constants  assuming  the  maximum  refractive  index  modulation  is 
achieved  in  each  case  at  the  maximum  diffraction  efficiency  using  the  same  method  as  that  used 
in  reference  8  but  assuming  that  it  is  a  function  of  the  total  writing  power.  This  gives  a  writing 
time  constant  of  30.4s  for  the  curve  using  PCMs  and  42.2s  for  the  curve  without.  This 
experiment  was  repeated  several  times  giving  similar  improvements  in  diffraction  efficiency. 
The  increased  rate  of  growth  of  diffraction  efficiency  also  observed  in  each  case  may  be  due  to 
the  increased  illumination  arising  from  use  of  the  PCMs.  This  is  beneficial  as  multiple 
multiplexed  recordings  can  be  made  in  a  shorter  time.  The  increased  maximum  diffraction 
efficiency may be due to the better refractive index modulation uniformity with depth.

Fig. 2 The growth of holographic grating diffraction efficiency with time. Curve 1 uses conventional
recording; curve 2 uses the novel phase conjugate mirror recording configuration.
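
The figures quoted in this section can be compared with a few lines of arithmetic. The short Python sketch below uses only the peak efficiencies and fitted time constants stated in the text; the variable names are illustrative, and the "relative gain" is a derived quantity, not one reported by the authors.

```python
# Values quoted in the text.
eta_pcm, eta_plain = 22.1, 17.9   # peak diffraction efficiencies, percent
tau_pcm, tau_plain = 30.4, 42.2   # fitted writing time constants, seconds

absolute_gain = eta_pcm - eta_plain              # percentage points
relative_gain = absolute_gain / eta_plain * 100  # percent of the conventional peak
tau_ratio = tau_pcm / tau_plain                  # below 1 means faster writing with PCMs

print(f"absolute gain: {absolute_gain:.1f} percentage points")
print(f"relative gain: {relative_gain:.1f} %")
print(f"time-constant ratio: {tau_ratio:.2f}")
```

The absolute gain reproduces the 4.2 points stated above; relative to the conventional peak this is a little under a quarter, and the fitted time constant with PCMs is roughly 0.7 times the conventional one.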

Fig. 3 shows the experimental layout of our multimode phase conjugate resonator, which
incorporates  a  set  of  angularly  multiplexed  gratings  stored  by  the  aforementioned  technique, 
within  the  cavity.  In  order  to  store  several  holographic  gratings  it  is  necessary  to  use  angled 
reference  beams  over  a  large  angular  range  since  there  must  be  a  minimum  angular  change 
between  holographic  recordings  in  order  to  reduce  crosstalk.  Consider  the  case  of  15  beams 
with  2.5mm  diameter  beam  widths,  as  an  example.  The  power  entering  the  first  PCM  (shown 
on the left of figure 3) immediately after recording and on initial playback was 0.3 mW. After
phase conjugation the power was 20 µW, giving an 11.4% reflectivity (neglecting Fresnel losses),
which is less than the 35% measured earlier using a single beam. The holographic gratings were
not fixed and so were gradually erased by the reading beam. This can be clearly seen in figure 4,
which shows the optical power phase conjugated by the first PCM as measured after the 50%
beam splitter. Zero time corresponds to the moment when all 15 green beams illuminated the phase
conjugate barium titanate crystal. After one minute a red (He-Ne) inducing beam was switched
on to illuminate the crystal with a power of 50 µW at an angle of 60 degrees, a distance of 1 mm
from the corner of the crystal. The curve clearly shows that the phase conjugated power
increases abruptly from zero to 20 µW in 2.5 min. At this time the inducing beam was turned
off, and the subsequent exponential decay is due to the erasure of the holographic gratings in the
lithium niobate. After 25 min, when the power had decayed to 4 µW, it suddenly dropped to
0.3 µW. This indicates the thresholding effect of the PCM. Figure 5 shows the 15 simultaneously
phase  conjugated  beams.  We  observed  each  eigenmode  of  the  system  as  a  spot  of  light.  It  was 
confirmed  that  the  system  was  resonating  by  chopping  all  of  the  beams  in  front  of  each  PCM  in 
turn. If the spot is extinguished when each set of beams is blocked, this confirms that
signals are circulating around the system. The CCD camera showed a large spot of light with
detailed multimode structure.

3.  Conclusion 

In  conclusion,  this  paper  introduces  a  novel  recording  technique,  in  which  two  crystals  of 
barium  titanate  are  used  as  phase  conjugate  mirrors  during  recording  to  enhance  the  resulting 
holographic  diffraction  efficiency.  We  demonstrated  a  phase  conjugate  resonator  having  fifteen 
modes  incorporating  an  angularly  multiplexed  set  of  gratings  recorded  using  our  novel 
technique.  Finally,  we  demonstrated  phase  conjugation  of  fifteen  beams  by  means  of  an 
inducing beam of another wavelength (non-degenerate multi-beam induced phase conjugation).

Fig. 3 Multimode phase conjugate resonator.

Fig. 4 Total optical power phase conjugated by the first phase conjugate mirror rises abruptly
when the inducing beam (IB) is switched on and then decays exponentially.

Fig. 5 Fifteen modes simultaneously supported in the phase conjugate resonator as observed on
screen S in figure 3.

Abstract. The reversal input superposing technique is applied to an optical learning neural network.
Optical neural networks that incorporate the technique do not need negative weights or
subtraction, and can inherently be constructed as all-optical systems.


1.  Introduction 

In general neural network models, the synaptic weights and the weighted sums are finite
real values, so some of them are negative. On the other hand, in optical implementations of
neural networks, the outputs of neurons and the values of weights are usually represented
by optical intensities, transmittances or reflectivities. It is hard to realize negative values
under these representations. Therefore, in some optical neural network systems, the weights
are duplicated into separate positive and negative weights. After each weighted sum operation,
these values are subtracted by electronic circuits [1,2]. This approach requires twice as many
pixels in the weight matrix as well as electronic subtraction circuits, and the system speed is
limited by these circuits. In another approach, a bias is added to the weights and the threshold
of each neuron is changed according to the total value of the inputs. In this case special hardware
is needed to control each threshold, and an all-optical system is impossible.

We have proposed the reversal input superposing technique (RIST) to avoid the above
problems [3,4]. In this paper, we report an optical learning neural network with RIST. An
optoelectronic neural system based on RIST is implemented, and the learning capability is
realized by using a Pockels readout optical modulator (PROM) device [5].


2.  Principle  of  RIST 

Consider general discrete neural network models consisting of M neurons, which receive
the same input signals $X = \{x_1, \ldots, x_i, \ldots, x_N\}$ from N input neurons, and emit respective output
signals $V = \{v_1, \ldots, v_j, \ldots, v_M\}$. Let $w_{ji}$ be the synaptic weight from the i-th input neuron to the j-th
output neuron. The j-th weighted sum of input signals $u_j$ and the output $v_j$ are written as

$$u_j = \sum_{i=1}^{N} w_{ji} x_i - h_j \qquad (1)$$

and

$$v_j = f(u_j), \qquad (2)$$

where $h_j$ is the j-th value of a set of the thresholding values $H = \{h_1, \ldots, h_j, \ldots, h_M\}$ and f is a
nonlinear output function. Introducing a constant a, Eqs. (1) and (2) are rewritten as

$$u'_j = \sum_{i=1}^{N} (w_{ji} + a)\, x_i + a \sum_{i=1}^{N} (1 - x_i) \qquad (3)$$

and

$$v_j = f'(u'_j), \qquad (4)$$

where $f'(y) = f(y - Na)$, since expanding Eq. (3) gives $u'_j = \sum_i w_{ji} x_i + Na$. The thresholding
values $H = \{h_j\}$ are neglected because they can be treated in the same way as one of the synaptic
weights. The constant a is a bias added to the weights $w_{ji}$ and is chosen such that $a > -\min(w_{ji})$,
where $\min(w_{ji})$ denotes the minimum value of the set of weights $W = \{w_{ji}\}$. Hence, the biased
synaptic weights $\{w_{ji} + a\}$ in the first term of Eq. (3) are necessarily positive. The input $x_i$
ranges from 0 to 1 in many neuron models. In these cases, the values $\{1 - x_i\}$ in the second term
of Eq. (3) are the reversed values of the inputs $x_i$ in the range 0 to 1, and are non-negative.
Under these conditions all terms of Eq. (3) are non-negative and easily realized by optical
techniques. $f'$ is the nonlinear function whose thresholding value is shifted by the constant Na
from that of f in Eq. (2).

Fig. 1 Schematic block diagram of RIST in a neural network (diagram labels: input plane, weighted interconnections, output plane).

Figure 1 shows a schematic block diagram realizing Eq. (3). The lower path (PATH1)
of the diagram performs the weighted sum operation on the input signals, which is the
first term of Eq. (3). The upper path (PATH2) generates the reversed inputs of the second
term. The results calculated on the output plane are applied to the devices that perform
the nonlinear function with the shifted thresholding value.
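
The identity behind RIST can be checked numerically. The following Python sketch is a minimal illustration, assuming random bipolar weights, inputs in the range 0 to 1 and neglected thresholds; the function names and the particular choice of the bias a are hypothetical. It confirms that the all-positive sum of Eq. (3) differs from the bipolar weighted sum only by the constant Na.

```python
import random


def weighted_sum(w_row, x):
    """Conventional bipolar weighted sum (Eq. (1) with the threshold neglected)."""
    return sum(w * xi for w, xi in zip(w_row, x))


def rist_sum(w_row, x, a):
    """Eq. (3): biased-weight term (PATH1) plus reversed-input term (PATH2)."""
    path1 = sum((w + a) * xi for w, xi in zip(w_row, x))
    path2 = a * sum(1.0 - xi for xi in x)
    return path1 + path2


random.seed(0)
N = 8
w_row = [random.uniform(-1.0, 1.0) for _ in range(N)]  # bipolar weights
x = [random.random() for _ in range(N)]                # inputs in [0, 1)
a = -min(w_row) + 0.1                                  # any a > -min(w) is allowed

u = weighted_sum(w_row, x)
u_prime = rist_sum(w_row, x, a)

# Every quantity that must be represented optically is non-negative ...
assert all(w + a > 0 for w in w_row)
assert all(0.0 <= 1.0 - xi <= 1.0 for xi in x)
# ... and the result differs from the bipolar sum only by the constant N*a.
assert abs(u_prime - (u + N * a)) < 1e-9
```

The offset Na is the same for every neuron and every input, which is why it can be absorbed into a shifted threshold of the output function.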


3.  Optical  Implementation  and  Experiments 

We apply a polarization encoding method in order to obtain the reversed inputs. With
polarization-encoded patterns and two orthogonal analyzers, the input and reversed
patterns are easily produced simultaneously.
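
A simple way to see why two orthogonal analyzers deliver a pattern and its reversal at the same time is Malus's law. The Python sketch below assumes ideal, lossless analyzers and a linear polarization rotated by an angle θ that encodes the input value; this encoding rule and the function names are assumptions made for illustration, not a description of the LCTV used in the experiment.

```python
import math


def encode_angle(x):
    """Polarization angle (radians) that encodes an input value x in [0, 1]."""
    return math.acos(math.sqrt(x))


def analyzer_outputs(theta):
    """Normalized intensities behind two orthogonal ideal analyzers (Malus's law)."""
    direct = math.cos(theta) ** 2     # analyzer parallel to the reference axis
    reversed_ = math.sin(theta) ** 2  # analyzer perpendicular to it
    return direct, reversed_


inputs = [0.0, 0.25, 0.5, 1.0]
pairs = [analyzer_outputs(encode_angle(x)) for x in inputs]
```

Since cos²θ + sin²θ = 1, the second analyzer always carries 1 − x when the first carries x, which is the reversed input required by the second term of Eq. (3).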

The experimental system of an optoelectronic neural network with RIST is shown
in figure 2. He-Ne laser light (633 nm) is used in the recall process. The input signals of the
input neurons are polarization encoded by a liquid crystal TV (LCTV) panel that has no
analyzer. After the polarization beam splitter (PBS1), the input patterns pass through a PROM
device that displays the weights, and the weighted sum operations are performed (PATH1).
The reversed patterns, on the other hand, pass through another arm (PATH2) and are
superposed on the results calculated in PATH1. The superposed results are detected by a CCD
camera. The balance of these signals is adjusted by controlling the angle of a half-wave plate
in PATH2. The nonlinear function with a shifted threshold value is calculated in a computer.

Fig. 2 Experimental setup of the optical learning neural network with RIST (diagram labels: beam splitter, polarizer, PBS: polarization beam splitter, image processor and computer).

To give the system a learning capability, a PROM device is located in
PATH1. The PROM is written or erased by white light according to the values
displayed on the LCTV. Therefore it is not necessary to align the weight matrix on the
PROM with the input signals on the LCTV. The modification values of the weights are
calculated by a computer from the teaching signals and the recall results, displayed on the
LCTV through an image processor, and written and/or erased on the PROM.

Fig. 3 (a) 8-8 neuron network structure (input neurons, output neurons) and (b) learning patterns (8 neurons, 3 patterns):

Input_1 = {0,1,0,1,0,1,0,1}
Input_2 = {0,0,1,0,1,1,1,0}
Input_3 = {1,0,1,1,1,0,0,0}

Fig. 4 Learning curves of each stored pattern and the total error (horizontal axis: iteration).

On the experimental system, a single-layer structure with 8 input and 8 output
neurons is realized (Fig. 3(a)). Three input patterns of 8 bits are stored on the system
according to the orthogonal learning rule (Fig. 3(b)). The weights on the PROM start from
random values near the constant a. Here, we set a to half the dynamic range of the
device. The learning curves are shown in figure 4. After 13 iterations, the weights had
converged to the correct values and the error had fallen to 0.


4.  Conclusion 

The reversal input superposing technique is applied to an optical learning neural network.
Optical neural networks that incorporate the technique do not need negative weights or
subtraction. The learning capability was added to the optoelectronic system using a
PROM device and verified with an 8-8 neuron network.

The authors thank S. Ishihara, H. Yajima and T. Hidaka of the Electrotechnical Laboratory
for useful discussions and encouragement. Thanks are also due to Y. Osugi of NGK Insulators,
Ltd. for providing the PROM.

Abstract. A new architecture that combines an optical Fourier transform
processor with a three-layer optical neural network is proposed. This hybrid system
can reduce the number of neural elements.

1.  Introduction 

Optical neural network computing is of interest for massively
parallel computing. Many optical neural network systems based on the matrix-vector
multiplication architecture [1], however, are limited in their scale of parallelism
by the spatial resolution of optical systems and spatial light modulators.
Moreover, when bipolar synaptic weights are represented by optical intensity,
they are divided into positive unipolar weights and negative unipolar weights [2],[3].
This means that twice the number of synaptic weights is needed. To solve these
problems, we introduce optical feature extraction techniques as an alternative
method. Feature extraction maps the data onto a new feature space.
The data in the feature space then serve as input data for an optical neural
system. If a suitable mapping system can be designed, the capability, the functions
and the size required of the optical neural system could be reduced.

We  propose  here  a  new  type  of  neural  computing  with  a  preprocessor  for 
feature  extraction  to  reduce  the  number  of  neural  units.  A  variety  of  optical 
feature  extraction  techniques  have  been  discussed,  which  include  analog[4]  and 
digital  methods[5].  The  Fourier  transform,  the  Hough  transform,  the  polar 
coordinate  transform,  the  scale-invariant  transform  and  so  on  are  typical  examples 
of  analog  methods.  In  this  paper,  we  use  a  Fourier  Transform  optical  system  as  a 
preprocessor,  because  of  its  data  compression  and  shift-invariant  properties. 


2.  Feature  Extraction  and  Hybrid  Computing 


Figure 1 shows the concept of a hybrid optical neural network using the
Fourier transform, which is one form of feature extraction preprocessing. We note that
the magnitude of the Fourier transform of a real object is shift invariant and
point-symmetrical. An arbitrary half region (for example, the upper half) of the Fourier
transformed image is connected with positive synaptic weights, and the other region with
negative synaptic weights. This procedure halves the number of weights to be displayed
on a spatial light modulator.
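
Both properties can be verified on a small example. The Python sketch below is an illustration under stated assumptions: it uses a discrete Fourier transform on a hypothetical 8x8 real test pattern and a circular (wrap-around) shift, which is the discrete analogue of translating an object in front of a Fourier lens. It checks that the transform magnitude is unchanged by the shift and is symmetric about the origin.

```python
import cmath


def dft2_magnitude(img):
    """Magnitude of the 2-D discrete Fourier transform of a small square real image."""
    n = len(img)
    mag = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0j
            for r in range(n):
                for c in range(n):
                    acc += img[r][c] * cmath.exp(-2j * cmath.pi * (u * r + v * c) / n)
            mag[u][v] = abs(acc)
    return mag


def circular_shift(img, dr, dc):
    """Shift a square image by (dr, dc) pixels with wrap-around."""
    n = len(img)
    return [[img[(r - dr) % n][(c - dc) % n] for c in range(n)] for r in range(n)]


# Hypothetical 8x8 real test pattern (an "L" shape).
n = 8
img = [[0.0] * n for _ in range(n)]
for r in range(1, 6):
    img[r][2] = 1.0
img[5][3] = img[5][4] = 1.0

mag = dft2_magnitude(img)
mag_shifted = dft2_magnitude(circular_shift(img, 2, 3))

shift_invariant = all(
    abs(mag[u][v] - mag_shifted[u][v]) < 1e-9 for u in range(n) for v in range(n)
)
point_symmetric = all(
    abs(mag[u][v] - mag[(-u) % n][(-v) % n]) < 1e-9 for u in range(n) for v in range(n)
)
```

A shift only multiplies the transform by a linear phase factor, so the magnitude detected in the Fourier plane is unchanged, and for a real input the value at (−u, −v) is the complex conjugate of the value at (u, v), so one half-plane carries all of the magnitude information.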


Fig. 1. Concept of hybrid optical neural computing (diagram labels: input real image; shift invariant, rotation invariant, scale invariant).

3.  Computer  Simulation 

Computer simulation of the feature extraction preprocessing and of the neural
network with the back-propagation learning algorithm is performed. We consider
a three-layer neural network consisting of 64x64 input layer neurons, 64 hidden
layer neurons, and 5 output neurons, to recognize the 26 alphabet characters of
64x64 pixels.


Fig. 2 Input character example "A".

Fig. 3 FT of the character "A".

Fig. 4 An arbitrary region of transformed patterns.


Figure 2 shows an example of the inputs, the 26 alphabet characters. These
characters are represented by an 8-bit gray scale. First, we calculate the Fourier
transforms of the input characters. Figure 3 shows the Fourier transform of the
character of Fig. 2. The transformed patterns are normalized and are also
represented by an 8-bit gray scale. Because of the point symmetry of Fourier
transformed images, an arbitrary half of each transformed pattern, as shown in
Fig. 4, is used for feature extraction.
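
The saving obtained by feeding only one half of the Fourier plane to the network can be estimated by counting interconnections. The Python sketch below assumes a fully connected three-layer network with the layer sizes given in the text and does not count offsets; the function name is hypothetical, and reading the 5 outputs as a binary code is an assumption, since the text does not state the output coding.

```python
def weight_count(n_inputs, n_hidden=64, n_outputs=5):
    """Number of interconnection weights in a fully connected three-layer network."""
    return n_inputs * n_hidden + n_hidden * n_outputs


full = weight_count(64 * 64)        # all 64x64 samples used as inputs
half = weight_count(64 * 64 // 2)   # only one half-plane used as inputs
classes_addressable = 2 ** 5        # if the 5 outputs are read as a binary code
```

Under these assumptions the input-to-hidden connections dominate the total, so using half of the transformed pattern roughly halves the number of weights, and five binary outputs are sufficient to label 26 characters.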

Figure 5 shows the change of the squared-sum error for the 26 characters versus
the number of learning iterations when the inputs to the neural network are:

(1)  Original  character  images,  (Fig.2) 

(2)  Fourier  transformed  images,  (Fig.3) 

(3) Arbitrary half of Fourier transformed images. (Fig.4)

Fig. 5 Change of the squared-sum error.

Conventional optical neural networks are not able to recognize position-shifted
input patterns. To test the shift invariance provided by the Fourier transform,
we have estimated the recognition ability for non-shifted and for shifted input
images using the values of weights and offsets obtained when the
learning is completed. Figure 6 shows the error rates of recognition for each
input pattern with and without a random shift of up to 4 pixels. The results indicate
a reduction of the error rates due to the shift invariance of the Fourier transform.


4. Conclusion

We propose a new type of neural computing with a preprocessor for feature
extraction. The Fourier transform of a real object, which is one form of feature extraction
preprocessing, is shift invariant in magnitude and point-symmetrical. The feature extraction
preprocessing we use here has two advantages: reduction of the number of neuron
units and shift invariance for input patterns.

Abstract. We present an optical implementation of the recognition stage
of a Kohonen-like neural network based on a supervised learning algorithm.
The set-up includes Diffractive Optical Elements to both store and address
the weights.


1.  Introduction 

Our  work  involves  the  study  of  optical  implementation  of  neural  networks.  We  describe 
and  show  the  feasibility  of  the  optical  implementation  of  the  recognition  stage  of  a 
neural  network  based  on  Kohonen  feature  maps.  Indeed,  this  stage  can  be  understood 
as  a  multichannel  correlation  and  thus  is  well  suited  to  an  optical  implementation.  The 
learning  stage  is  done  numerically.  The  set  up  uses  two  binary  phase  holograms  :  a 
16x16  microlens  array  to  achieve  the  replication  of  the  Fourier  transform  of  the  inputs 
and  a  Computer  Generated  Hologram  (CGH)  that  contains  the  weights  of  the  computed 
Kohonen  map. 


2.  Learning  Algorithm 

We  use  a  supervised  learning  algorithm  based  on  Kohonen  self-organizing  feature  maps 
for pattern recognition. While in the classical Kohonen algorithm the map is
self-organized by the inputs according to their probability density function, pattern
recognition applications need some supervision to map similar stimuli in terms of their
context and not in terms of a known vector distance metric. We adopt here an idea previously
used for semantic map organization [1] and adapted to pattern recognition [2]. The basic
motivation is to organize the map by the patterns and their association targets
simultaneously (e.g., in our case, the class to which the input pattern belongs). A by-product
of this process is that the class-labelling of neurons on the map emerges during the learning
phase, while in classical Kohonen maps the labelling is achieved after the learning
stage.

We present in figure 1 the feature map used for our experiments. It shows an 8x8
neuron map; for each neuron, a 16x16 pixel image shows the weights between the
input layer and the neuron itself, the intensity of each weight being proportional to its
darkness. This representation exhibits several properties of the model:

-  Each  neuron  is  specialized  ;  hence,  it  is  possible  to  “recognize”  patterns  on  the 
map. 

-  Inside each class, large differences between input patterns lead to several
neurons belonging to this class, while small variations are accounted for
by the weights being calculated in such a way that the
patterns on the map appear as “blurred” images.

-  There  is  a  topology  on  the  map  since  all  the  neurons  belonging  to  one  class  are 
close  to  each  other. 

-  One  can  observe  the  clustering  of  the  neurons  on  the  map  according  to  the 
classes. 

The  learning  database  includes  750  patterns  of  an  actual  French  postal  code 
database  which  was  already  segmented.  Each  pattern  is  a  16x16  binary  pixel  image. 


Figure  1.  8x8  feature  map  used  for  our  experiments  (see  the  clustering  of 
the  map  according  to  the  classes). 


3.  Numerical  Studies 

The recognition stage and its application to handwritten digit recognition have been
numerically implemented. More precisely, two points relevant to an optical implementation
were studied:

-  While the selection of a neuron, for classical Kohonen feature maps, is made
by minimizing a distance between the input and the weights, it is easier for an
optical version to maximize an inner product. These two conditions are strictly
equivalent under the assumption that the weights and the inputs used for the
learning stage are normalized.

-  Another issue is the influence on the recognition results of using a binary
hologram instead of the actual weights map.
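
The equivalence stated in the first point follows from the expansion of the squared Euclidean distance, which for unit-norm vectors reduces to 2 minus twice the inner product. The Python sketch below illustrates this with random normalized vectors; the vector sizes and function names are hypothetical and much smaller than the 256-input, 64-neuron map used in the paper.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def inner_product(a, b):
    return sum(p * q for p, q in zip(a, b))


random.seed(1)
dim, n_neurons = 16, 10  # hypothetical sizes chosen for illustration
weights = [normalize([random.random() for _ in range(dim)]) for _ in range(n_neurons)]
x = normalize([random.random() for _ in range(dim)])

# Winner selected by the classical rule (minimum distance) ...
winner_by_distance = min(range(n_neurons), key=lambda j: squared_distance(x, weights[j]))
# ... and by the rule that is easier to implement optically (maximum inner product).
winner_by_product = max(range(n_neurons), key=lambda j: inner_product(x, weights[j]))
```

With normalized inputs and weights both rules select the same neuron, so the optical system only needs to find the brightest correlation output.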

The  studies  were  done  using  a  1024x1024  binary  CGH  that  was  computed  using 
mainly  an  error  diffusion  algorithm.  They  show  for  a  50  pattern  test  database  (5  patterns 
for  each  class)  the  validity  of  the  computation  of  the  CGH  and  the  feasibility  of  the  optical 
implementation. 

4.  Experimental  Work 

For  our  set  up,  the  size  of  the  inputs  is  16x16  pixels  and  the  Kohonen  map  has  8x8  neurons 
so  that  256x64  weights  must  be  coded  using  a  1024x1024  pixel  off-axis  hologram.  For  the 
classification  stage  that  requires  64x10  interconnections  a  micro-computer  is  used.  This 
set  up  is  a  slightly  improved  version  of  a  previously-used  one  for  the  implementation 
of  an  associative  memory  [3]  [4].  It  is  specially  designed  to  achieve  a  multiplexing  of 
connection  holograms  involving  a  special  coding  scheme  for  the  weights  matrix  that  we 
called  Frequency  Multiplexed  Raster  (FMR).  This  scheme  allows  one  to  encode  a  4D 
matrix  on  a  plane.  Moreover,  for  an  optical  implementation,  it  leads  to  a  standard 
spatial  filtering  set  up  except  that  the  inputs  and  outputs  must  be  sampled  in  a  special 
manner.  The  holographic  filter  contains  the  FMR  encoded  synaptic  array. 

A holographic element (a microlens array or Dammann gratings, for instance) provides
the sampling scheme of the inputs. The first Fourier lens of the set up performs the
Fourier Transform (FT) of the sampled input and allows one to replicate the FT of each
input pattern. This component and the connection hologram are binary phase elements
fabricated by standard means: contact lithography using e-beam designed masks and
reactive ion etching of a silica substrate. This technological part of our work was
done by the laboratories of the CNET-Bagneux, France.

The output image diffracted by the connection hologram is detected using a CCD
camera. The image measures the activation of the neurons for a given input. Since
each  activation  is  the  inner  product  of  the  input  and  one  neuron  (whose  coordinates 
are  its  weights  since  it  is  fully  connected  to  the  input),  the  result  can  be  seen  as  the 
correlation  peak  of  the  input  and  each  of  the  patterns  shown  on  figure  1.  Hence,  this 
optical  implementation  can  also  be  understood  as  a  multichannel  correlator  except  that 
the  filter  is  computed  in  such  a  way  that  the  results  for  all  the  channels  are  obtained 
using  a  simple  Fourier  lens. 

For the classical Kohonen algorithm, a special kind of non-linearity called
“Winner-Takes-All” is used, which means that only the most active neuron is retained. In our case,
the  difference  between  neuron  activation  is  first  emphasized  by  quadratic  detection.  The 
decision  is  then  made  by  computer  and  suitable  software. 

We present in figure 2 two experimental results for two different 0's. The strongest
activations of the neurons are in the “right” part of the map (see figure 1 for
the references), showing that these results are in good agreement with what was expected.

5.  Conclusion 

In conclusion, we have presented results for three steps of an optical implementation of a
Kohonen-like neural network: software implementation of the learning algorithm, numerical
simulations of the optical recognition step, and experimental results. The results
we get with actual postal code patterns show the validity of our encoding scheme and
prove the possibility of achieving any kind of interconnections between two planes.

Figure 2. Experimental results (b) and (d) for two different inputs (a) and
(c). The activations are to be compared with the feature map of figure 1.

We  are  still  working  on  the  experimental  part  of  this  work  in  order  to  improve, 
in  terms  of  dynamic  range,  the  quality  of  the  storage  of  the  data  for  the  connection 
hologram. 
Abstract. Noise will limit the size and speed of optoelectronic neural networks. The pattern
recognition  performance  of  an  experimental  network  was  measured  against  signal-to-noise  ratio. 
Accuracy  was  improved  by  redundancy  in  the  network  configuration. 


1.  Introduction 

The  computing  power  that  will  be  demanded  by  many  future  applications,  for  example  the 
control  of  future  packet-based  telecommunications  switches  or  searches  through  massive 
databases,  is  likely  to  grow  beyond  the  capacity  of  conventional  methods.  Some  of  these 
tasks  are  amenable  to  neural  processing  and  could  therefore  be  accelerated  by  parallel 
operation  on  appropriate  hardware.  Optoelectronic  networks  could  further  expand  the  scale  of 
parallel systems, but they must also be fast. Otherwise the extra parallelism will just
compensate for lost speed, rather than extending performance beyond the bounds of electronics.

In  high-speed  optical  systems  receiver  noise  becomes  dominant;  hence  this  paper  first 
examines  in  general  how  noise  limits  network  size  and  then  how  it  affects  performance  in  a 
particular  instance.  The  effects  of  introducing  redundancy  into  the  network  configuration  have 
been  modelled  and  confirmed  experimentally.  The  example  problem  was  binary  pattern 
recognition,  chosen  both  for  ease  of  testing  and  because  it  resembles  some  simple  operations 
on  packet  headers. 


2. Noise limit to network size

Consider  a  network  with  fan-out  from  a  modulator  array  to  a  weight  matrix  and  fan-in  to  a 
detector  array,  with  the  total  optical  power  evenly  divided  between  all  the  connections.  The 
detection  circuits  must  be  able  to  resolve  the  smallest  significant  change  in  the  value  of  one 
weight,  so  the  maximum  number  of  connections  that  can  be  supported  is: 

where M and N are the numbers of modulators and detectors, S is the total optical power
available, F is the optical efficiency, h is the number of weight levels and $P_{res}$ is the
smallest resolvable optical power. $P_{res}$ was calculated for a state-of-the-art
communications receiver assuming an error rate of 10'^, but with a large-area photodiode
(7 pF) and a high mean level of illumination ($100 \times P_{res}$). $F = \eta \tau_m \tau_w$,
where $\tau_m$ and $\tau_w$ are the usable transmittance ranges of the modulators and mask
respectively and $\eta$ is the net efficiency of the other components. In the experimental
system, $\eta = 0.3$, $\tau_m = 0.12$ and $\tau_w = 0.8$, giving F = 0.03. Network size per
unit power was plotted against data rate for this value of F assuming 100 weight levels
(Fig. 1). Two more curves were plotted, for ideal modulators or a laser array
($\tau_m \approx 1$, hence F = 0.24) and for an ideal system (F = 1). The results indicate
that a large network (MN > 1000, say) could run at speeds > 100 MHz powered by a
semiconductor laser source, provided that an overall optical efficiency of 0.24 or better
were attained.

Fig. 1. Network size / data rate with 100 weight levels. F is overall optical efficiency.
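
The efficiency figures quoted in this section can be checked with a few lines of arithmetic. The sketch below reproduces the product 0.3 x 0.12 x 0.8 and the ideal-modulator case. The function `max_connections` is only an assumed form of the connection bound, whose printed equation is not available here: it supposes that the power reaching one connection, divided by the number of weight levels, must be at least the smallest resolvable power. The power and resolution values passed to it in the usage lines are hypothetical.

```python
def overall_efficiency(eta, tau_mod, tau_mask):
    """Overall optical efficiency as the product of the three factors named in the text."""
    return eta * tau_mod * tau_mask

def max_connections(total_power, efficiency, levels, p_res):
    """ASSUMED bound: M*N <= S*F / (h * P_res). Not taken from the source."""
    return total_power * efficiency / (levels * p_res)

experimental = overall_efficiency(0.3, 0.12, 0.8)      # quoted in the text as 0.03
ideal_modulators = overall_efficiency(0.3, 1.0, 0.8)   # quoted in the text as 0.24

# Hypothetical numbers: 1 mW source, 100 weight levels, 1 nW resolvable power.
example_limit = max_connections(1e-3, ideal_modulators, 100, 1e-9)
```

Under this assumed form the supportable network size scales linearly with source power and efficiency and inversely with the number of weight levels, which matches the qualitative statements made about Fig. 1.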


3.  Pattern  recognition  theory 

A single neuron is sufficient for binary pattern recognition if its connection weight to each
expected one in the input pattern is +1, and to each expected zero is -1. But with large patterns
the threshold would be high and the weight tolerances tight, so in the experimental system
margins were improved by dividing the target into sub-patterns with a neuron to recognise
each. Some redundancy was introduced by having two overlapping sets of three sub-patterns
and arranging a second-layer neuron to respond whenever more than four of the six
sub-patterns were detected.

By assuming that the receiver noise at its input is Gaussian, the probability that a neuron
will be on may be calculated (Fig. 2). Each first-layer neuron has an initial probability of 0.5
which tends asymptotically to 1 or 0 as the signal-to-noise ratio is increased, according to
whether or not the correct sub-pattern is present. The on probability of the output neuron first
falls below 0.5 even with the target pattern before beginning to increase with signal-to-noise
ratio. This is because the output neuron is susceptible both to errors induced by the noise at its
own input and to errors occurring in the first layer. For signal-to-noise ratios above 1.6,
however, the majority vote system makes the output more accurate than the individual
first-layer neurons. The output probability was also calculated for patterns close to the correct
target. Patterns differing by just one bit are more likely to be misclassified than is the target,
but discrimination could be improved with a greater degree of redundancy (more sub-pattern
overlap).
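
The qualitative behaviour described above can be illustrated with a simplified model. The sketch assumes that a first-layer neuron switches on when a Gaussian-distributed input exceeds its threshold, that the signal-to-noise ratio is the decision margin divided by the noise standard deviation, that the six sub-pattern detectors err independently, and that the second-layer neuron is noiseless. None of these simplifications comes from the paper, so the numbers are indicative only and will not reproduce the quoted crossover at 1.6.

```python
import math

def on_probability(snr, pattern_present=True):
    """Probability that a thresholded neuron is on, for Gaussian noise.

    snr is the (assumed) ratio of decision margin to noise standard deviation.
    """
    margin = snr if pattern_present else -snr
    return 0.5 * (1.0 + math.erf(margin / math.sqrt(2.0)))

def majority_probability(p, n=6, more_than=4):
    """Probability that more than `more_than` of n independent detectors are on."""
    return sum(math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
               for k in range(more_than + 1, n + 1))

# Output probability for the target pattern at a few signal-to-noise ratios.
curve = [(snr, majority_probability(on_probability(snr))) for snr in (0.0, 1.0, 2.0, 3.0)]
```

Even in this reduced model the output starts well below the first-layer probability at low signal-to-noise ratio and overtakes it once the first-layer neurons become reliable, which is the effect the redundancy is meant to exploit.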
4. Pattern recognition with an optoelectronic network

The experimental system [1] used to test these results employed an array of high-speed
multi-quantum-well modulators [2] as inputs (Fig. 3). They were illuminated by a single 1.5 µm
semiconductor laser using a computer-generated hologram [3] to divide the beam. A second
hologram  provided  fan-out  to  the  weight  mask.  Weights  to  recognise  a  target  pattern  were 
calculated  and  encoded  on  photographic  film.  Fan-in  was  achieved  by  imaging  the  weight 
mask  onto  a  detector  array  feeding  comparators.  Bipolar  weight  values  were  mapped  to 
unipolar  mask  transmittances  using  a  shared  threshold.  The  mapping  scheme  also  cancelled 
offsets  caused  by  the  limited  contrast  of  the  modulators  and  mask  and  required  only  a  small 
increase  in  network  size.  Because  the  neuron  thresholds  were  part  of  the  weight  mask  rather 
than  voltages  applied  to  the  comparators,  the  laser  power  could  be  adjusted  to  control  the 
signal-to-noise  ratio  while  leaving  the  relative  values  of  all  neuron  inputs  unchanged. 
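
One way to picture a bipolar-to-unipolar mapping of this kind is an offset-and-scale scheme in which the neuron threshold travels through the optics as a reference term. The sketch below is an illustration of that general idea only: the mapping t = (w + 1) / 2, the reference term and all names are assumptions, and the paper's actual scheme (including its offset cancellation for limited contrast) is not reproduced here.

```python
def to_transmittance(w):
    """Map a bipolar weight in [-1, 1] to a unipolar transmittance in [0, 1] (assumed mapping)."""
    return (w + 1.0) / 2.0

def bipolar_decision(weights, inputs, threshold):
    """Reference decision using signed weights directly."""
    return sum(w * x for w, x in zip(weights, inputs)) >= threshold

def unipolar_decision(weights, inputs, threshold):
    """Same decision using only non-negative transmittances.

    The signal channel carries sum(t_i * x_i); the reference channel carries
    half the summed input plus half the threshold, so both scale together
    when the source power is changed.
    """
    signal = sum(to_transmittance(w) * x for w, x in zip(weights, inputs))
    reference = sum(0.5 * x for x in inputs) + threshold / 2.0
    return signal >= reference

# Hypothetical 4-bit target 1011: weight +1 for expected ones, -1 for expected zeros.
target_weights = [1, -1, 1, 1]
agree = all(
    bipolar_decision(target_weights, bits, 3) == unipolar_decision(target_weights, bits, 3)
    for bits in [[a, b, c, d] for a in (0, 1) for b in (0, 1) for c in (0, 1) for d in (0, 1)]
)
```

Because signal and reference are both proportional to the optical power in this sketch, scaling the power changes the noise margin without changing which side of the comparison wins, in line with the remark about adjusting laser power.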

A stream of input patterns including the target was presented to the network at 8.4
Mword/s. The frequencies of correct recognitions of the target and the six sub-patterns were
measured as laser power was increased (Fig. 4). As predicted, the performance of the whole
network exceeded the average of its first-layer neurons for input powers over 2 mW. Then
responses to other patterns close to the target were recorded (Fig. 5). False recognition
frequencies were also broadly as predicted; only a few patterns caused significant numbers of
false recognitions at full power, probably because of inaccuracies in weight fabrication.

Fig. 3. Experimental system. The beams corresponding to one input only have been indicated.

Fig. 4. Measured output and first-layer recognition frequencies / laser power.

5.  Conclusions 

Networks large and fast enough to challenge electronic systems will operate close to the noise
limit  if  powered  by  semiconductor  sources,  as  they  must  be  to  allow  compact  integrated 
systems.  However,  this  work  demonstrates  that  redundancy  in  the  network  can  reduce  the 
frequency  of  noise-induced  errors. 
Fig. 5. Measured recognition frequencies with target and other patterns.
Abstract. A general neural unit with only a complex-valued matrix depicting the integrated
synaptic connection and lateral interaction is proposed. Various connections can be self-routed
by the routing code contained in the input. Based on the mixed negabinary system, an
inner-product algorithm is developed to calculate the complex matrix-vector multiplication without
carries,  signs,  and  decimal  points.  An  incoherent  optical  correlator  with  spatial  coding  is 
suggested  to  execute  the  algorithm  in  parallel. 


1.  Introduction 

Biological neurons receive input connections from different sources: external
input signals through the incoming synaptic connections, and lateral interaction
within the same area [1]. Various neural network models have been developed [2].
Recently, a general neural unit was suggested as a building block [3], which is
a cascade of a synaptic connection layer and a lateral interaction layer.

In this paper, we suggest that the combined stimulus states of a neuron be
represented by complex numbers: the synaptic connections are indicated by real
numbers, the lateral interaction by imaginary numbers, and the integrated effect
by a pair of routing codes. A general neural unit with only a complex-valued
matrix is thereby established to calculate the integrated connections, and various
connections can be self-programmed by the routing code contained in the inputs.
To deal with the complex-valued and multi-valued matrix-vector multiplication,
we develop a digital inner-product algorithm on the basis of the real and
imaginary decomposition and the mixed negabinary system. Its features are
no carries, no signs, no decimal points, and simple pre- and post-processing.
Correspondingly, an optical architecture of incoherent optical correlation with
spatial digital coding is suggested, which can execute the algorithm in parallel.

In previous optical neural networks [4,5], analog and bipolar matrix-vector
multiplications were usually utilized. In optical matrix processing, various
digit systems and architectures were used to improve the accuracy [6-9], but they
have problems with carries, signs, decimal points, and pre- and post-processing.

2. Self-routing general neural unit


The complex-valued general neural unit is shown in Fig. 1. The states of the
input signals are denoted by a complex-valued vector, $[v_j (K_1 + jK_2)]$, where
$v_j \in \{1, -1\}$ is the input stimulus and $K_1, K_2 \in \{1, 0, -1\}$ form the routing
code. The complex-valued matrix consists of a real and an imaginary part,
$[T^R_{ij}] + j[T^I_{ij}]$. The real and imaginary parts of the resulting vector are

$$[v'^R_i] = K_1 [v^R_i] - K_2 [v^I_i], \qquad [v'^I_i] = K_2 [v^R_i] + K_1 [v^I_i], \tag{1}$$

where $[v^R_i] = [T^R_{ij}][v_j]$ and $[v^I_i] = [T^I_{ij}][v_j]$.


By using different routing codes, the resulting real and imaginary parts each have
nine connection functions (see Table 1). Thus either one can be selected as the
output. In the figure, the imaginary outputs are used as the terminals.


Figure  2 


Table 1. Self-routed connection functions

K1   K2   [v'^R] = K1[v^R] - K2[v^I]        [v'^I] = K2[v^R] + K1[v^I]

 1    1   [v^R] - [v^I]      (1)            [v^R] + [v^I]      (10)
 1    0   [v^R]              (2)            [v^I]              (11)
 1   -1   [v^R] + [v^I]      (3)            -[v^R] + [v^I]     (12)
 0    1   -[v^I]             (4)            [v^R]              (13)
 0    0   0                  (5)            0                  (14)
 0   -1   [v^I]              (6)            -[v^R]             (15)
-1    1   -[v^R] - [v^I]     (7)            [v^R] - [v^I]      (16)
-1    0   -[v^R]             (8)            -[v^I]             (17)
-1   -1   -[v^R] + [v^I]     (9)            -[v^R] - [v^I]     (18)
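
The eighteen entries of Table 1 all follow from expanding the complex product of the matrix and the routed input. The sketch below, written under the assumption that the routing rule is exactly this expansion, compares the routed combinations of the real-part and imaginary-part products with a direct complex matrix-vector multiplication for all nine routing codes. The small matrices and the function names are hypothetical.

```python
import itertools

def matvec(matrix, vector):
    """Plain real matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def routed_outputs(v_real, v_imag, k1, k2):
    """Real and imaginary outputs selected by the routing code (K1, K2)."""
    real = [k1 * r - k2 * i for r, i in zip(v_real, v_imag)]
    imag = [k2 * r + k1 * i for r, i in zip(v_real, v_imag)]
    return real, imag

def complex_unit(t_real, t_imag, v, k1, k2):
    """Direct evaluation of (T^R + jT^I) applied to v * (K1 + jK2)."""
    code = complex(k1, k2)
    return [sum(complex(a, b) * (x * code) for a, b, x in zip(row_r, row_i, v))
            for row_r, row_i in zip(t_real, t_imag)]

def check_all_codes(t_real, t_imag, v):
    """True if the routed form matches the direct complex product for every code."""
    v_real, v_imag = matvec(t_real, v), matvec(t_imag, v)
    for k1, k2 in itertools.product((1, 0, -1), repeat=2):
        real, imag = routed_outputs(v_real, v_imag, k1, k2)
        direct = complex_unit(t_real, t_imag, v, k1, k2)
        if real != [c.real for c in direct] or imag != [c.imag for c in direct]:
            return False
    return True

# Hypothetical example matrices and bipolar input.
T_REAL = [[1, -1, 0], [0, 1, 1]]
T_IMAG = [[0, 1, -1], [1, 0, 1]]
V = [1, -1, 1]
```

For instance, the code (K1, K2) = (0, 1) routes the synaptic product to the imaginary output and the negated lateral product to the real output, as in rows (4) and (13) of the table.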

The  results  are  then  processed  by  nonlinearity  devices  (NLD)[2].  The  coding 
units  (CUs)  are  used  to  index  the  real  signals  with  the  routing  code  for  the 
cascaded  complex  neural  unit. 


The complex-valued general neural unit itself can represent a variety of
self-organized neural networks such as the Hamming network [2] and the Kohonen
network [1]. It can be simplified to the Hopfield associative memory [2] by a real
matrix, or to the laterally interconnected neural network [1] or the MAXNET
network [2] by a complex matrix whose real part is a unit matrix. Furthermore,
it is able to represent multi-layer networks by the cascade of units.

The synaptic matrix is denoted by $[T^S_{ij}]$ (N×M, M>N) and the lateral
interaction matrix by $[T^L_{ij}]$ (N×N). The learning rule is

$$[T^R_{ij}] = [T^S_{ij}] \quad \text{and} \quad [T^I_{ij}] = ([T^L_{ij}] - [I])[T^S_{ij}]. \tag{2}$$

3.  Negabinary  inner-product  algorithm 

The negabinary number system has the radix of -2 [10]:

$$a = \sum_{n=-L} a_n (-2)^n \quad (a_n \ge 0), \tag{4}$$

where the digit $a_n$ may be a positive fraction in its mixed form.

The addition of two real numbers is simply the addition of their digits. For
the multiplication, we suggest a vector format for negabinary numbers. The
multiplicand is denoted by the negabinary vector a, and the multiplier
is written as a reversed and shifted negabinary vector $b_n$:

b  ={b  zr  .,'’\b  ,sb  ,b  ,."\b  .  ,,b 

The nth digit of the multiplication result is the inner product $c_n = a \cdot b_n$.
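
A small integer-only sketch may help to see why this needs neither carries nor signs. Each output digit is the inner product of the multiplicand digits with a reversed, shifted copy of the multiplier digits; the digits are left unreduced (the "mixed" form), and the radix -2 lets negative values be written without a sign. The function names are hypothetical and fractional digits are not handled.

```python
def to_negabinary(value):
    """Digits of an integer in radix -2, least significant digit first."""
    if value == 0:
        return [0]
    digits = []
    while value != 0:
        value, rem = divmod(value, -2)
        if rem < 0:            # force the digit into {0, 1}
            value += 1
            rem += 2
        digits.append(rem)
    return digits

def from_digits(digits):
    """Value of a digit vector in radix -2; digits may exceed 1 (mixed form)."""
    return sum(d * (-2) ** n for n, d in enumerate(digits))

def multiply_carry_free(a_digits, b_digits):
    """Digit n of the product is the inner product of a with b reversed and shifted by n."""
    out = []
    for n in range(len(a_digits) + len(b_digits) - 1):
        b_n = [b_digits[n - k] if 0 <= n - k < len(b_digits) else 0
               for k in range(len(a_digits))]
        out.append(sum(x * y for x, y in zip(a_digits, b_n)))
    return out

# Example: 6 * (-3) computed digit by digit, with no carry propagation.
product_digits = multiply_carry_free(to_negabinary(6), to_negabinary(-3))
```

Conversion of the mixed-form result back to an ordinary number is a single weighted sum, which is the kind of simple post-processing the text refers to.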

A complex number can be decomposed into a real part and an imaginary part.
The negabinary multiplicand vectors of the real and imaginary elements of the
complex matrix are denoted as $T^R(i,j)$ and $T^I(i,j)$, and the negabinary multiplier
vectors of the inputs $K_1 v_j$ and $K_2 v_j$ as $v^1(j)$ and $v^2(j)$. Thus, the nth real digit
and imaginary digit at the ith row of the resulting negabinary vector are

v„''(0  =  E  T\i,j)  -v^O-)  +  \  T\iJ) 

J  I  (6) 

v„'(0  =  E  T\U})  •  v„2(/')  +  ^Kui)  • 

j 

4.  Incoherent  optical  correlation  architecture 

An incoherent optical correlator with spatial coding is used to implement the
complex-valued and multi-valued matrix-vector multiplication in parallel [11],
which results in a correlation between two coded masks similar to the
representation by the negabinary inner-product algorithm. The conversion to
negabinary numbers, the thresholding necessary for neural nonlinearities, and the
iterative feedback are executed by a PC (see Fig. 2).

In the experiment, we use a laterally inhibited Hopfield model. The input
complex signal and the complex matrix are coded as shown in Fig. 3 and Fig. 4. The
predicted and experimental outputs are shown in Fig. 5 and Fig. 6.

Figure 5
Abstract. This paper is concerned with the problem of simulating one neural network by
another. Formal definitions of a neural network and of simulation are proposed. Two theorems
are presented: the first shows that each neural network can be replaced by a neural
network with non-negative connection weights only; the second shows that each neural
network can be replaced by a neural network with connection weights equal to 0 or 1.


1.  Introduction 

Artificial neural net models, or simply neural networks, have been studied for many years.
The models are based on our present understanding of the biological nervous system and are
composed of many nonlinear computational elements, the neurons. These neurons are
arranged in patterns (or layers) and connected via weights that are usually adapted during
use to improve the net performance [1]. Neural networks are finding many areas of
application. Although they are particularly well suited for applications related to associative
recall, such as content-addressable memories, neural networks can perform many other
tasks, ranging from logic operations to the solution of optimization problems.

A simplified model of the neuron was formulated in 1943 by McCulloch and Pitts [2]. In their
model, a neuron is a binary element which has two states: active (1) and non-active (0).
A connection weight may be positive (excitatory), negative (inhibitory) or zero (no
connection). A neuron becomes active only when the weighted sum of its inputs reaches a special
value called the threshold. An active neuron sends an output signal. This paper is based on the
McCulloch and Pitts model.

Although artificial neural networks (also called neurocomputers) are much simpler
than biological ones, they have some advantages over classical von Neumann computers.
Great research effort is being made to implement neural nets in hardware. Some problems arise
when connections with various weights, especially positive and negative weights, have to be
implemented in one network [3,4].

2.  Definitions 

A neural network is the triple (V, E, P), where (V, E) is an oriented graph with weights, V is the
set of nodes (neurons), E is the connection weight function, $E: V \times V \to R$ (i.e. E(x,y) is the
connection weight from neuron x to y), and P is the threshold function, $P: V \to R$ (i.e. P(x) is
the threshold of node x).

The state of the network is the function f, $f: V \to \{0,1\}$; f(x) is the state of neuron x
(1 - active, 0 - non-active). The set of all network states is denoted by F (i.e. $F = \{f: V \to \{0,1\}\}$).
The network changes its state from f to f' according to the transformation rule T (i.e.
f' = T(f)), where T is defined by
$$T(f)(x) = \begin{cases} 1 & \text{if } \sum_{y \in V} E(y,x) \cdot f(y) \ge P(x) \\ 0 & \text{otherwise} \end{cases}$$

The expression $\sum_{y \in V} E(y,x) \cdot f(y)$ is often called net(x).

This definition of the transformation rule corresponds to the synchronous Hopfield
binary model [4].

The action of the net with the initial state $f_0$ is the infinite sequence of states
$f_0, f_1, \ldots, f_n, \ldots$, where $f_{n+1} = T(f_n)$.

Network S = (V, E, P) is simulated by the network S* = (V*, E*, P*) when functions
$g: V \to V^*$ and $h: N \to N$, h(0) = 0 (h ascending), exist such that for every action
$(f_0, f_1, f_2, \ldots)$ of network S there exists an action $(f^*_0, f^*_1, f^*_2, \ldots)$ of network S*
satisfying the condition $(\forall x \in V)(\forall k \in N)\; f_k(x) = f^*_{h(k)}(g(x))$.

Remarks: 

1) Function g defines that neuron x of network S corresponds to neuron g(x) of network
S*.

2) Function h defines that instant k in the action of network S corresponds to instant h(k) in
the action of network S*.

3) More intuitively we can say: network S is simulated by network S* if the condition
"neuron x is active in network S at the instant k when, and only when, neuron g(x) is active in
network S* at the instant h(k)" is satisfied for all neurons x and any instant k.

3.  Networks  with  only  non-negative  connection  weights 

Theorem  1. 

Any  network  S  can  be  simulated  by  the  network  S*  with  non-negative  connection  weights 
only. 

Proof. A network with the desired features will be constructed. At first we introduce
simplified notation related to networks S and S*.

Elements of network S are denoted as follows:

1) Set of nodes (neurons) of network S: $V = \{x_1, x_2, \ldots, x_n\}$.

2) Thresholds of nodes of network S: $p_i$ is the threshold of node $x_i$, i = 1,2,...,n (i.e. $p_i = P(x_i)$).

3) Connection weights in network S: $w_{ij} = E(x_i, x_j)$ is the connection weight from $x_i$ to $x_j$
(may be negative).

Elements of network S* are denoted as follows:

1) Set of nodes (neurons) of network S*: $V^* = \{x_1^p, x_1^d, x_2^p, x_2^d, \ldots, x_n^p, x_n^d\}$. Notice that two
nodes $x_i^p$ and $x_i^d$ in S* replace one node $x_i$ in S. Node $x_i^p$ is called the primary node and
$x_i^d$ is called the dual node.

2) Thresholds of nodes of network S*: $p_i^p$ is the threshold of node $x_i^p$, $p_i^d$ is the threshold of node
$x_i^d$, i = 1,2,...,n.

3) Connection weights in network S*: $w_{ij}^{pp} = E^*(x_i^p, x_j^p)$ is the connection weight from $x_i^p$ to
$x_j^p$, $w_{ij}^{pd} = E^*(x_i^p, x_j^d)$ is the connection weight from $x_i^p$ to $x_j^d$, $w_{ij}^{dp} = E^*(x_i^d, x_j^p)$ is the
connection weight from $x_i^d$ to $x_j^p$, and $w_{ij}^{dd} = E^*(x_i^d, x_j^d)$ is the connection weight from $x_i^d$ to $x_j^d$.

Before the formal construction of network S* we describe the intuition on which S* is
based. Node $x_i^p$ is active in S* exactly when $x_i$ is active in S and, conversely, $x_i^d$ is
active in S* when $x_i$ is non-active in S. Then, at any instant, exactly one of the nodes $x_i^p$, $x_i^d$ is
active in S*, and the activity of $x_i^p$ in S* corresponds to the activity of $x_i$ in S.
Now we describe the formal construction of network S*.

Each node of network S corresponds to two nodes of network S*, so each connection
of network S corresponds to four connections of network S*. Their weights are defined as
follows:

If $w_{ij} \ge 0$, then $w_{ij}^{pp} = w_{ij}^{dd} = w_{ij}$, $w_{ij}^{pd} = w_{ij}^{dp} = 0$, and if $w_{ij} < 0$, then
$w_{ij}^{pp} = w_{ij}^{dd} = 0$, $w_{ij}^{pd} = w_{ij}^{dp} = -w_{ij}$. Notice that these weights are non-negative.

The definition of the thresholds is more complicated.

Let $u_i = \sum_{j \in W_i^-} w_{ji}$, where $W_i^- = \{j : w_{ji} < 0\}$, i.e. $W_i^-$ is the set of all nodes from
which connections with negative weight come to $x_i$, and $u_i$ is the sum of these negative
connection weights. Let $d_i = \sum_{j \in W_i^+} w_{ji}$, where $W_i^+ = \{j : w_{ji} \ge 0\}$, i.e. $W_i^+$ is the set
of all nodes from which connections with non-negative weight come to $x_i$, and $d_i$ is the sum of
these non-negative connection weights.

Threshold values are defined as follows: $p_i^p = p_i - u_i$ for primary nodes and
$p_i^d = -p_i + d_i + \varepsilon_i$ for dual nodes, where $\varepsilon_i$ is a positive number smaller than the
minimal distance from a summed input at $x_i$ which does not activate $x_i$ to the threshold $p_i$
(an $\varepsilon_i$ which satisfies this condition exists because of the binary character of the nodes).

This $\varepsilon_i$ guarantees that if net($x_i$) = $p_i$ in the original network, then only node $x_i^p$
becomes active in the simulating network; without $\varepsilon_i$ both nodes $x_i^p$ and $x_i^d$ would be active.

Lemma  1. 

Network S* defined above simulates the original network S for function g defined as $g(x_i) = x_i^p$
and function h defined as h(k) = k. The proof of this lemma is presented in [4].

Some thresholds in network S* constructed in Theorem 1 may be negative. This may
occur even when all thresholds in the original network S are positive. Network S* can be
modified to have all thresholds positive by modifying the connection weights between nodes $x_i^p$
and $x_i^d$. More details are presented in [4].

Corollary 1.

Any network S is simulated by a network S* with only non-negative connection weights and
only positive node thresholds.
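
The construction of Theorem 1 can be tried out on small networks. The sketch below assumes integer weights and thresholds, so that a fixed margin of 0.5 can play the role of the small positive number added to the dual thresholds; it also assumes that the update rule activates a node when its summed input is greater than or equal to its threshold. Primary nodes are stored at indices 0..n-1 and dual nodes at n..2n-1. All function names are hypothetical, and the check is a brute-force test over every initial state, not a substitute for the proof in [4].

```python
import itertools
import random

def step(weights, thresholds, state):
    """Synchronous update; weights[y][x] is the weight from node y to node x."""
    n = len(state)
    return tuple(
        1 if sum(weights[y][x] * state[y] for y in range(n)) >= thresholds[x] else 0
        for x in range(n)
    )

def nonnegative_network(weights, thresholds, eps=0.5):
    """Build the primary/dual network; eps=0.5 is valid for integer data only."""
    n = len(thresholds)
    big_w = [[0] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            w = weights[i][j]
            if w >= 0:
                big_w[i][j] = w              # primary -> primary
                big_w[n + i][n + j] = w      # dual -> dual
            else:
                big_w[i][n + j] = -w         # primary -> dual
                big_w[n + i][j] = -w         # dual -> primary
    big_p = [0.0] * (2 * n)
    for i in range(n):
        incoming = [weights[j][i] for j in range(n)]
        u = sum(w for w in incoming if w < 0)    # sum of negative incoming weights
        d = sum(w for w in incoming if w >= 0)   # sum of non-negative incoming weights
        big_p[i] = thresholds[i] - u
        big_p[n + i] = -thresholds[i] + d + eps
    return big_w, big_p

def simulates(weights, thresholds, steps=6):
    """Check f_k(x_i) = f*_k(primary i) and dual = complement, from every initial state."""
    n = len(thresholds)
    big_w, big_p = nonnegative_network(weights, thresholds)
    for f0 in itertools.product((0, 1), repeat=n):
        s, t = f0, f0 + tuple(1 - v for v in f0)
        for _ in range(steps):
            s, t = step(weights, thresholds, s), step(big_w, big_p, t)
            if t != s + tuple(1 - v for v in s):
                return False
    return True

def random_network(n, rng):
    """Hypothetical test data: small integer weights and thresholds."""
    weights = [[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]
    thresholds = [rng.randint(-2, 3) for _ in range(n)]
    return weights, thresholds
```

The check starts the dual nodes in the complement of the primary nodes, matching the intuition that exactly one node of each pair is active at any instant.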

4.  Networks  with  weights  equal  0  or  1  and  normalized  thresholds 
Theorem  2. 

Any network S with only non-negative connection weights and only positive node
thresholds is simulated by a network S* with all connection weights equal to 0 or 1 and all node
thresholds positive and identical.

Proof.  Network  S*  with  the  desired  features  will  be  constructed. 

This construction follows some steps.

1) At first notice that any network can be transformed to a network with weights and
thresholds represented by rational numbers.

2) An identical threshold p for all nodes can be reached by appropriate scaling of the
connection weights. New weights are defined by $w^*_{ji} = \frac{p}{p_i} w_{ji}$.

3) After steps 1 and 2 all weights and thresholds are non-negative rational numbers, so they
can be transformed to natural numbers by multiplying by an appropriate constant.

The network obtained in step 3), instead of the initial network, will be the base of the following
steps. For simplicity, the threshold and weights obtained in step 3) will be denoted
p and $w_{ji}$ respectively.

4) Any connection (with an integer weight $w_{ij}$) in network S is replaced by $w_{ij}$ pairs of
connections, each with weight equal to 1, and by $w_{ij}$ intermediate nodes. The set of intermediate
nodes introduced to replace the connection from $x_i$ to $x_j$ of the original network is denoted
$Y_{ij}$. The set of all intermediate nodes is denoted Y.

5) All nodes, including the intermediate ones, should have threshold p; moreover an individual
impulse arriving from an original network node should activate the appropriate intermediate node.

A set $Z = \{z_1, z_2, \ldots, z_{p-1}\}$ of additional nodes is introduced to satisfy this condition.
Elements of Z are always active and they are connected to each other. Connections from nodes
$z_1, z_2, \ldots, z_{p-1}$ arrive at every intermediate node $y \in Y$.

Generally, network S* is composed of nodes X* corresponding to the original network nodes,
intermediate nodes Y and additional nodes Z.

6) It should be described in which way network S* simulates network S, i.e. functions g, h and
the initial state $f^*_0$ should be defined.

Function g is defined as $g(x_i) = x^*_i$ and function h is defined as h(k) = 2k; this means that
network S* acts two times slower than network S.

The initial state $f^*_0$ is defined as follows:

$(\forall i)\; f^*_0(x^*_i) = f_0(x_i)$; $(\forall i,j)(\forall y \in Y_{ij})\; f^*_0(y) = f_0(x_i)$; $(\forall z \in Z)\; f^*_0(z) = 1$.

The construction of network S* guarantees that it simulates the original network S.
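
Steps 4) to 6) can also be exercised on a toy network. The sketch below starts from a network that already has non-negative integer weights and one common positive integer threshold p (the result of steps 1 to 3), expands every weight into unit-weight paths through intermediate nodes, and checks that the original nodes of the expanded network at instant 2k agree with the original network at instant k. How the always-active nodes keep themselves active is not fully specified in the text; the sketch assumes a group of p such nodes with self-connections, of which p - 1 feed each intermediate node. That detail, and all names, are assumptions.

```python
import itertools

def step(weights, threshold, state):
    """Synchronous update with one common threshold; weights[y][x] is y -> x."""
    n = len(state)
    return tuple(
        1 if sum(weights[y][x] * state[y] for y in range(n)) >= threshold else 0
        for x in range(n)
    )

def unit_weight_network(w, p):
    """Expand integer weights w[i][j] >= 0 into a 0/1 weight table with threshold p."""
    n = len(w)
    edges, y_source = [], []
    for i in range(n):
        for j in range(n):
            for _ in range(w[i][j]):
                y = n + len(y_source)          # index of a new intermediate node
                y_source.append(i)
                edges += [(i, y), (y, j)]
    first_z = n + len(y_source)
    z_nodes = list(range(first_z, first_z + p))
    # Assumption: p always-on nodes, fully connected with self-loops, stay active.
    edges += [(a, b) for a in z_nodes for b in z_nodes]
    # p - 1 of them feed each intermediate node, so one unit impulse from the
    # source node is exactly what is needed to reach threshold p.
    edges += [(z, n + k) for z in z_nodes[:p - 1] for k in range(len(y_source))]
    size = first_z + p
    big = [[0] * size for _ in range(size)]
    for src, dst in edges:
        big[src][dst] = 1
    return big, y_source

def simulates(w, p, steps=4):
    """Check f_k(x_i) = f*_{2k}(x*_i) from every initial state."""
    n = len(w)
    big, y_source = unit_weight_network(w, p)
    for f0 in itertools.product((0, 1), repeat=n):
        s = f0
        t = f0 + tuple(f0[i] for i in y_source) + (1,) * p
        for _ in range(steps):
            if t[:n] != s:
                return False
            s = step(w, p, s)
            t = step(big, p, step(big, p, t))   # two steps of S* per step of S
    return True

# Hypothetical three-node network with common threshold 2.
EXAMPLE_W = [[0, 2, 1], [1, 0, 3], [2, 1, 0]]
```

The number of intermediate nodes equals the sum of all integer weights, which illustrates why the transformed network can be much larger than the original.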

Corollary  2. 

Any network S is simulated by a network S* with all connection weights equal to 0 or 1
and all node thresholds positive and identical.

Corollary  2  follows  directly  from  Corollary  1  and  Theorem  2. 

5.  Conclusion 

The first theorem has both theoretical and practical significance. It shows that a network with
inhibitory (negative) connections can always be replaced by a network with only excitatory
(positive) connections. This theorem also gives a method of constructing the simulating
network, which has only twice as many nodes and non-zero weighted connections as the original
net. Networks with only one kind of connection are often easier to implement in hardware,
even if they have some more nodes. For instance, they are simpler to implement optically.
The second theorem is of rather theoretical significance. It shows that only one value of
threshold and only one value of non-zero connection weight can be used to simulate any
neural network. Practical application of this theorem may in general be difficult because
transformed networks can have many more nodes than the original.
Abstract. The paper presents results on the learning of a two-layer neural network (TLNN),
on its tolerance to noise in the weight matrices, and on its hardware implementation.
The imperfections of optics satisfy the requirements of the TLNN model.


1.  Introduction 

Different types of neural networks (NN) are under investigation now. Creating working
models of optical NNs is most useful and promising [1]. One of the most interesting is the
TLNN, which can be easily made on the basis of a proposed optical vector-matrix multiplier
(OVMM). Computing in the TLNN is performed sequentially in the OVMM. High performance speed
is available due to the implementation of the high-speed, high dynamic range multichannel
multifrequency acoustooptic modulator (MAOM), laser array and CCD (charge-coupled
device) array.


2.  NN’s  mathematical  models  suitable  for  optoelectronic  implementation 

Because optoelectronic devices are most effective at performing
vector-matrix multiplication, it is very useful to put the main computational load on them and to
simplify the non-linear transformations as much as possible. It is therefore reasonable to
assume that neural networks with the same transformation for all neurones (homogeneous
NN) are the most suitable for optoelectronic realization.

From the viewpoint of hardware implementation the most promising architecture is the
multilayer neural network. As has been proved [2], such networks can solve image
recognition tasks. The number of interconnections in such networks is smaller than in
completely connected ones. This reduces the complexity of the hardware realization.
3.  TLNN  Hardware

Optical realization of the TLNN is an example of the effective application of the MAOM. The
main part of the calculation in this case is the vector-matrix multiplication A×B=C. The proposed
OVMM architecture (fig. 1) allows one to perform this in one step.

The algorithm we propose is similar to DMAC (digital multiplication via analogue
convolution) [3], but there is one significant difference, concerned with the ability to use
frequency coding in the MAOM. Owing to this, the OVMM allows one to multiply (convolve) 2D
and 3D data flows and to obtain the whole vector-matrix multiplication in one step, at speeds
of about the frame rate of the MAOM. In this case the vector A is input as a 2D array into the
laser array (LA). The elements of the 3D array (matrix B) are input into the MAOM: i is the
channel number, j is the pulse position in the channel aperture at the exposure moment, and s is
the acoustic wave frequency number (f_s). When the MAOM aperture is filled, the LA emits
eight light pulses (exposures), according to the number of bits in each vector element.
Radiation from each laser uniformly illuminates the corresponding “window” of every channel.
The optical system transfers the light diffracted in the MAOM onto a set of eight linear CCD
arrays. Each linear CCD contains eight points (according to the number of frequencies). The
CCD operates in the shift-and-add mode. The shift velocity corresponds to the bit loading
speed of the lasers. The charge accumulated at a point of a linear CCD during the first
exposure is shifted to the next point of this CCD for the second exposure. Each linear CCD
represents a component of the output vector C in a mixed representation. The supporting
electronic control system (CS) is responsible for converting the presentation of this data from
mixed to hex format, the thresholding, the synchronization, etc.
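
The arithmetic behind DMAC can be sketched in software. This is a minimal illustration of the general principle only, not a model of the OVMM optics: the product of two binary numbers is obtained by convolving their bit sequences, which yields digits in a mixed (non-carried) binary representation, and a final conversion step resolves the carries. The function names and the 8-bit default width are assumptions made for the example.

```python
def to_bits(value, width):
    """Little-endian list of the lowest `width` bits of a non-negative integer."""
    return [(value >> k) & 1 for k in range(width)]


def convolve(a, b):
    """Discrete convolution of two digit sequences (the analogue shift-and-add step)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def mixed_to_int(digits):
    """Convert mixed binary digits (each digit may exceed 1) to an ordinary integer."""
    return sum(d << k for k, d in enumerate(digits))


def dmac_multiply(x, y, width=8):
    """Multiply two `width`-bit numbers via convolution of their bit sequences."""
    return mixed_to_int(convolve(to_bits(x, width), to_bits(y, width)))


def dmac_dot(xs, ys, width=8):
    """Inner product: mixed-binary digits are accumulated before the final conversion."""
    acc = [0] * (2 * width - 1)
    for x, y in zip(xs, ys):
        for k, d in enumerate(convolve(to_bits(x, width), to_bits(y, width))):
            acc[k] += d
    return mixed_to_int(acc)


if __name__ == "__main__":
    print(convolve(to_bits(13, 4), to_bits(11, 4)))  # mixed-binary digits of 13 * 11
    print(dmac_multiply(13, 11))
```

Accumulating the mixed-binary digits over all element pairs before conversion, as in `dmac_dot`, mirrors the way a detector can sum contributions in the analogue domain and leave carry resolution to the electronics.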


Fig. 1. Vector-matrix multiplier.
Fig. 2. Example of DMAC multiplication with implementation of multiplexing: a) laser pulses,
b) multiplexed signal in the acoustooptic cell, c) CCD output signal in a mixed representation.


To check the possibility of performing DMAC in the described way, a breadboard
model performing multiplication of 8-digit numbers was assembled. The rate of bit-by-bit
entry was limited by the CCD array transfer frequency. A typical signal resulting from
DMAC multiplication with multiplexing in the MAOM, registered
directly at the CCD output, is presented in Fig. 2. Taking the limiting parameters
of a TeO2 MAOM to be 100 “windows” and 8 frequencies, the period of laser pulse
repetition is about 10 ns and the CCD array transfer frequency is about 120 MHz.


4.  TLNN  Noise  tolerance 

A TLNN simulation program was written. The NN was trained with gradient
and stochastic algorithms.

In a hardware implementation of the TLNN each parameter may differ from the
calculated one, because the technology of optoelectronic NN fabrication
does not allow exact representation of the calculated parameter values.

The aim of the statistical noise experiments was to determine the tolerance to deviation
of the parameter values from the calculated ones. The criterion was the ability to
recognize all the learned images.
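
A statistical experiment of this kind can be sketched as follows. The sketch is only an illustration of the procedure (perturb every parameter, then count the learned images that are still recognized); it is not the simulation program used in this work. The network is a hand-built two-layer matched-filter network rather than one trained by gradient or stochastic algorithms, and the noise ratio (NR) is assumed to be the half-width of a uniform relative error applied to each weight. Patterns, names and trial counts are all hypothetical.

```python
import random

# Four mutually orthogonal bipolar "images" (hypothetical stand-ins for learned images).
PATTERNS = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
]


def perturb(matrix, nr, rng):
    """Multiply every weight by (1 + e), with e uniform in [-nr, +nr]."""
    return [[w * (1.0 + rng.uniform(-nr, nr)) for w in row] for row in matrix]


def forward(w1, w2, x):
    """Two-layer network: rectified hidden layer followed by a linear output layer."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def recognized_fraction(nr, trials=200, seed=0):
    """Fraction of (trial, image) pairs still classified correctly at noise ratio nr."""
    rng = random.Random(seed)
    n = len(PATTERNS)
    w1 = [[float(v) for v in p] for p in PATTERNS]  # hidden unit k matches image k
    w2 = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    correct = 0
    for _ in range(trials):
        n1, n2 = perturb(w1, nr, rng), perturb(w2, nr, rng)
        for k, x in enumerate(PATTERNS):
            out = forward(n1, n2, x)
            if max(range(n), key=out.__getitem__) == k:
                correct += 1
    return correct / (trials * n)


if __name__ == "__main__":
    for nr in (0.0, 0.1, 0.3, 0.6, 0.9):
        print(f"NR = {nr:.0%}: recognized fraction = {recognized_fraction(nr):.3f}")
```

The numbers printed by this toy model depend entirely on the assumed patterns and noise model, so they should not be compared with the results reported in this section.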

As a result of the modelling, the mean value of misrecognition and the mean number of
unrecognized images were determined. The results of the experiments are the following. The
TLNN retains the ability of correct recognition in two cases:

(i)  The  noise  ratio  (NR)  of  synaptic  weights  is  less  than  15%.  The  NR  of  all  other  parameters 
is  less  than  5%. 


(ii)  The  NR  of  all  the  parameters  is  less  than  10%. 
Fig. 3. Noise tolerance of the TLNN.


This shows that any optoelectronic NN fabrication technology which can
represent all calculated parameters in hardware with an accuracy better than 10% can be
used for the hardware realization of the TLNN.

Figure 3 presents the plot of noise tolerance for all the parameters of the TLNN. It is
important to note the linear decrease of the number of correctly recognised images (N)
with the NR of the TLNN’s parameters. At NR=60% the TLNN still correctly
recognizes 50% of the images. Computer systems of previous generations
exhibit another type of plot, namely an avalanche-like dependence: 100% recognition at weak
noise and 0% recognition beyond some noise level.

5.  Conclusion 

The described TLNN model is fast, versatile, stable and easy to implement. The computer
experiment shows that such a TLNN has significant noise resistance. These facts make the
TLNN attractive for hardware realization.
Features  of  Application  of  Holographic  Memories  in  PC
Networks  and  Optical  Neural  Computers 


V.N.  Shahgedanov 

Moscow  Scientific  &  Research  Institute  of  Instrument  Engineering,  Russia 


Abstract. The characteristics of experimental prototypes of disc holographic memories of
1.5 Gb capacity and with a 16 Mb s⁻¹ data reading rate are given. The features of interfacing
these devices with personal computers are discussed, as well as the question of the
application of the data recorded in a holographic disc as a command matrix for a neural
computer.


Optical memory systems have considerable advantages over traditional memory systems:
these include higher density of recording, higher rate of data reading and transmission, and
storage of the information for long periods of time. Optical disc systems recording
information in the form of holograms are extremely promising for storage of large
volumes of information (exceeding 10 Gb). Use of 1-D holograms is one way of
implementing a multichannel recording device (Mikaelyan, 1981). Being a compromise
between 2-D holographic systems and optical bit systems, 1-D systems retain the
advantages of both and fit well into the architecture of a computer.

Development of an element base for a 1-D memory system is less complicated than
that of a 2-D system. At present, multichannel devices based on
1-D holograms are being developed for use as high speed data recorders in
various systems of data processing and storage and as external storage in
computer networks. We suggest that a holographic disc may be used as a command matrix
for neural computers. Characteristics of a memory system prototype based on 1-D
holographic recording are presented in Table 1. For successful implementation of
holographic memory systems in PC networks and in optical neural computers, high speed
processing is required. This is why the major feature of the devices is the use of
specially designed lines of photodiodes based on the metal-resistor-semiconductor (MRS)
structure. Due to the high conductivity of the resistive layer, the MRS structure works under a
constant voltage. Homogeneity of the process in the MRS structure is achieved through the
negative feedback that arises at the border of the semiconductor-resistor layer. Characteristics
of the photodiodes are presented in Table 2.
Table 1. Characteristics of Memory System Prototypes

Technical characteristic                      HMS      Universal HMS   Optical Memory System APX-400
Disc capacity (one side), GB                  1.5      0.4             0.4
Data transmission rate, Mb s⁻¹                2.0      16.0            0.625
Seeking & reading time for data
volume of 1 Mb, sec                           0.75     0.31            4.0
Probability of an error                       10'^^    10'*^           10‘^

HMS - holographic memory system


Table 2. Photodiode Characteristics

Wavelength, µm                                0.63
Number of photosensitive elements             36
Noise equivalent power, nW                    <5
Speed, MHz                                    >20
Multiplying factor                            >1000
Factor of connection between the elements     <5%
Step between elements, µm                     250
Size of the elements, µm                      100x100

The analysis shows that the use of a direct current amplifier for a photodiode device at
low levels of the detected optical signal (which is typical of a holographic memory system)
demands complicated methods of reconstruction of a constant component and control of
the comparator threshold. A simpler solution involves the installation of a low frequency filtering
capacitor, i.e. the use of an alternating current amplifier. This, in turn, requires preliminary
preparation of the recorded signal, which is achieved through the use of positional codes which
do not contain a constant component. Along with certain advantages (such as, for
example, simpler arithmetic), the positional codes possess a number of drawbacks. A
technical disadvantage is that a wide transmission band is required when transmitting
coded signals. Indeed, long sequences of “0” and “1” codes are transmitted by a zero
frequency signal, while alternation of “0” and “1” codes (width T₀) demands that the upper
transmission limit be not less than 1.5 T₀⁻¹ Hz. This is a strict requirement which results in
various technical problems. A desire to avoid zero frequency has led to a number of codes
where the lower frequency is shifted in time (Kovalenkov, 1989). These codes are either
partially or completely compensated, i.e. the constant component is independent of codes. A
particular example of a completely compensated code is the modified frequency modulated
(MFM) Miller’s code. This code is distinguished by a relatively simple technical realisation.
As it consists of three carrier frequencies, it does not require expansion of the band
towards the high frequencies. Due to its qualities, Miller’s code was widely accepted,
originally for magnetic recording devices, later in digital sound recorders, and at present in
optical recording discs and in holographic recording discs in particular.
An algorithm of transformation of binary code into the MFM code is presented in Fig.
1a. Fig. 1b presents a scheme of a coder transforming an input code Di into the MFM code.
The synchronising frequency of Di is F₀; an accompanying input frequency is 2F₀. The
positively rectifying differentiator forms the trigger input signal for a flip-flop unit which then
forms the MFM signal. The widths of rectangular pulses are T₀, 1.5T₀ and 2T₀ (Fig. 1a). The
maximum frequency does not exceed 0.5 T₀⁻¹, and the minimum frequency is 0.25 T₀⁻¹, i.e. the
lower limit of the transmitting band is 0.25 T₀⁻¹ Hz, and the upper limit is 0.75 T₀⁻¹ Hz. When
such an MFM signal is decoded by the decoder shown in Fig. 1c, the resulting code is the initial
Di shifted by 0.5T₀.
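
The coding rule can be illustrated with a short software sketch. It follows the commonly stated rule for Miller (MFM) coding and is not a model of the coder and decoder circuits of Fig. 1: the signal changes level in the middle of every “1” bit cell and at the boundary between two consecutive “0” cells. Time is counted in half bit cells, and all function names, as well as the assumed state before the first bit, are hypothetical.

```python
def mfm_encode(bits, previous_bit=1):
    """Return transition flags on a half-cell grid: [boundary, mid-cell] per data bit.

    Assumption: the cell preceding the record held `previous_bit` (default 1),
    so a leading 0 does not produce a boundary transition.
    """
    transitions = []
    prev = previous_bit
    for bit in bits:
        boundary = 1 if (bit == 0 and prev == 0) else 0  # between two zeros
        transitions += [boundary, bit]                   # mid-cell flip marks a 1
        prev = bit
    return transitions


def to_levels(transitions, start_level=0):
    """Two-level waveform produced by toggling a flip-flop at every transition."""
    level, levels = start_level, []
    for t in transitions:
        level ^= t
        levels.append(level)
    return levels


def mfm_decode(transitions):
    """Data bits are the mid-cell transition flags."""
    return transitions[1::2]


def run_lengths(transitions):
    """Distances between successive transitions, in half bit cells."""
    positions = [k for k, t in enumerate(transitions) if t]
    return [b - a for a, b in zip(positions, positions[1:])]


if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 0, 1]
    code = mfm_encode(data)
    print(code)
    print(to_levels(code))
    print(run_lengths(code))  # values of 2, 3, 4 half cells correspond to T0, 1.5T0, 2T0
```

Because the spacing between level changes only takes the values of 2, 3 and 4 half cells, the pulse widths are T₀, 1.5T₀ and 2T₀, as stated in the text.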
Fig. 1a. Combined Time Diagram of "i" Channel of the MFM Coder and Decoder.


Fig.  1  b,  c.  Functional  Diagram  of  "i"  Channel  of  the  MFM  Coder  (b)  and  MFM  Decoder  (c). 
Optoelectronic  Fuzzy  control  of  an  inverted  pendulum  using
light-emitting-diode  arrays  and  position-sensing-devices 

H  Itoh,  B  Houssay  and  S  Mukai 

Optoelectronics  Division,  Electrotechnical  Laboratory,  Tsukuba,  Ibaraki,  305  JAPAN 

T  Yamada  and  S  Uekusa 

School  of  Science  and  Technology,  Meiji  University,  Higashi  Mita,  Kanagawa,  214  JAPAN 

Abstract. For the first time, real-time optoelectronic fuzzy control of an inverted
pendulum is realized. The system uses the arithmetic product-sum-gravity method.
Antecedent and consequent operations using Gaussian-like membership functions
and center-of-gravity operations are realized by combinations of an LED array and
a position-sensing-device.


1.  Introduction 

A high-speed optoelectronic analog fuzzy inference technique has been proposed [1] using
beam-scanning laser diodes (BSLDs) [2] and position-sensing-devices (PSDs). The inference process
uses an arithmetic PRODUCT-SUM Gravity method with Gaussian membership functions, and
the controllability of the inference method is better than that of the conventional MIN-MAX Gravity
method with triangular membership functions [3]. The inference speed of the system will be
more than several tens of MFLIPS (Mega Fuzzy Logical Inferences Per Second) using
high-speed operation of the beam-scanning laser diodes. However, real system control using the
BSLDs has not yet been realized because of the present low resolution of their far-field
pattern.

In this paper, configurations of optoelectronic antecedent and consequent fuzzy operation
units using a light-emitting-diode (LED) array and a PSD are proposed. Furthermore, the
first optoelectronic control system using the consequent unit is realized and its usefulness is
demonstrated using an inverted pendulum.


2.  Optoelectronic  analog  antecedent  and  consequent  units 

Figure 1 shows a schematic configuration of a fuzzy antecedent grade evaluation unit, which
calculates the satisfaction of a fuzzy antecedent rule. In this figure, strings of 1, 2 and 3
serially wired LEDs are connected in parallel to the input. The 1-, 2- and 3-LED strings start
emitting in turn as the applied voltage increases. Because a PSD can detect the center-of-gravity
of the radiation pattern, a Gaussian-like membership function is realized by the position and
emission characteristics of the LEDs. Figure 2 shows a schematic result of several membership
functions. The peak position of the function is controlled by the values of the resistors and
level-shift diodes.

Fig.1 Optoelectronic antecedent grade evaluation unit using an LED array and a PSD.

Fig.2 Examples of membership functions of an optoelectronic antecedent operation unit using
an LED array and a PSD.

Figure 3 shows a schematic configuration of an optoelectronic fuzzy consequent
operation unit. Unification of the output membership functions by superposition of the LED
radiations, and defuzzification of the resulting radiation pattern, are realized by an LED array
and a position sensing device. Figure 4 shows an example of defuzzification by the unit. The
unequal intervals of the PSD output reflect the separation of the LEDs in the array.
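
The product-sum-gravity inference that these units perform can be sketched numerically. The sketch below is a software illustration under stated assumptions, not a model of the actual hardware: antecedent grades are Gaussian membership values, the grade of a rule is the product of its antecedent grades, and the defuzzified output is the centre of gravity of point-like consequent contributions (the analogue of LED positions weighted by emitted intensity, as a PSD would report). The rule centres, widths and output positions are hypothetical.

```python
import math


def gaussian(x, centre, width):
    """Gaussian membership grade of x for a fuzzy label centred at `centre`."""
    return math.exp(-((x - centre) / width) ** 2)


def product_sum_gravity(inputs, rules):
    """Product-sum-gravity inference.

    rules: list of (antecedents, output_position), where antecedents holds one
    (centre, width) pair per input. The output is the intensity-weighted mean
    of the output positions.
    """
    weighted_sum = 0.0
    total = 0.0
    for antecedents, position in rules:
        grade = 1.0
        for x, (centre, width) in zip(inputs, antecedents):
            grade *= gaussian(x, centre, width)  # PRODUCT over antecedents
        weighted_sum += grade * position          # SUM of consequent contributions
        total += grade
    return weighted_sum / total if total > 0.0 else 0.0  # centre of gravity


# Hypothetical single-input rule base: negative, zero and positive angle.
RULES = [
    ([(-1.0, 1.0)], -1.0),
    ([(0.0, 1.0)], 0.0),
    ([(1.0, 1.0)], 1.0),
]

if __name__ == "__main__":
    for angle in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(angle, round(product_sum_gravity([angle], RULES), 4))
```

With a second input such as angular velocity, each rule simply carries two (centre, width) pairs and its grade becomes the product of the two membership values.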


Fig.3 Optoelectronic consequent operation unit using an LED array and a PSD.

Fig.4 Output sample of a consequent operation unit using an LED array and a PSD.


3.  Inverted  pendulum  control  system 


Figure 5 shows a schematic configuration of the optoelectronic fuzzy inverted pendulum control
system. A 50 cm-long free-rotating inverted pendulum is mounted on a cart. The cart is on a
100 cm-long rail and is driven by a stepping motor. The input data for the fuzzy control are the
angle of the pendulum, which is measured by a rotary encoder, and the angular velocity of the
pendulum, which is calculated from the difference of measured angles. Consequent fuzzy inference
processes are realized by the optoelectronic operation unit and antecedent fuzzy inference
processes are simulated by a personal computer.


Fig.5 Inverted pendulum fuzzy control system with an optoelectronic/electronic consequent
operation unit.


Figure 6 shows schematic experimental results under various experimental conditions. The angle
of a controlled inverted pendulum after an initial perturbation (40 cm/sec, 0.3 sec) is displayed
as a function of time. Antecedent fuzzy operations are numerically simulated by a personal
computer. Cases of optoelectronic control "OE" and pure electronic fuzzy control "E" are
shown in the figure. The control periods of the "OE" and "E" cases are 0.015 sec/cycle and 0.08
sec/cycle, respectively. The difference in cycle time results from the introduction of optoelectronic
processing. The control gain of "HIGH" in figure 6 is 4.5 times that of "LOW". The
results show the superior controllability of the optoelectronic system.


Fig.6 Traces of the angle of an inverted pendulum controlled by optoelectronic systems and
electronic systems.
4.  Conclusion

Antecedent and consequent fuzzy operation units using an LED array and a position sensing
device were proposed and their characteristics were described. For the first time, real-time
control of an inverted pendulum was realized based on an optoelectronic analog fuzzy
inference system. Furthermore, better controllability was demonstrated using an optoelectronic
consequent fuzzy inference control unit. This result arises because the optoelectronic
implementation operates faster than numerical simulation.

The authors thank Yokoyama I for his technical support. The authors also thank Mori M,
Watanabe M, Hidaka T and Yajima H for fruitful discussions.
Independently  Addressable  Vertical-Cavity
Surface-Emitting  Laser  Diode  Arrays 
Abstract. The fabrication and testing of very uniform, planar proton-implanted
independently addressable 10x10 vertical-cavity surface-emitting
laser diode arrays is reported. Individual elements of 12 µm active diameter
show threshold currents around 4 mA and emit up to about 300 µW single-mode.
An electrical 3 dB modulation bandwidth of 4.6 GHz is determined.
The potential total transmission rate of the array is several hundred Gbit/s.


1.  Introduction 

Independently addressable, two-dimensional laser diode arrays have many applications,
for example in optical scanners, display technology, optical interconnects, high capacity
switching systems, or for the generation of high power coherent beams [1]. Using planar
vertical-cavity surface-emitting lasers (VCSELs), such arrays with high packing densities
are relatively easy to fabricate [2]. In addition, VCSELs with small active diameters
allow dynamic single-mode oscillation, very low threshold currents [4], and alignment-tolerant
and efficient coupling into single-mode fibers [3]. Here we report on 10x10
independently addressable vertical-cavity laser diode arrays with active diameters of 12
µm and an excellent homogeneity of the output characteristics over the whole array size.
3 dB modulation bandwidths of single-mode emitting array lasers are as high as 4.6 GHz
at an optical output power of 340 µW.


2.  Laser  Structure  and  Device  Fabrication 

The laser structure under investigation was grown by molecular beam epitaxy. The
one-wavelength-thick central region contains three 8 nm thick In0.2Ga0.8As quantum
wells which are embedded in GaAs barriers and Al0.3Ga0.7As cladding layers for efficient
carrier confinement. It is surrounded by two semiconductor Bragg reflectors. The upper
mirror is p-doped and consists of 18 quarter-wavelength AlAs/GaAs pairs, the lower
is n-doped and has 22.5 pairs. To reduce the resistivity of the reflectors we used two
intermediate Al0.3Ga0.7As and Al0.7Ga0.3As steps and enhanced doping concentration of
8-10^®  cm"^  near  the  heterointerfaces  driven  in  backward  direction.  During  growth  of  the 
upper  AlGaAs  cladding  layer  the  rotation  of  the  substrate  was  interrupted  to  produce  a 
spatial  thickness  variation  of  the  central  lasing  region.  The  thickness  variation  results  in 
a  spectral  shift  of  the  resonator  mode  across  the  wafer. 

Lateral  current  confinement  and  insulation  of  the  individual  lasers  is  achieved  by 
proton  implantation  with  an  energy  of  300  keV  and  a  dose  of  5-10^'*  cm  The  reflectivity 
of the upper Bragg reflector is improved with an additional Au layer on top. TiAu rings
form the ohmic p-contacts. The 100 elements are connected to 25 bonding pads at either
side of the 5x5 mm array chip by TiAu feeding lines which are deposited on an insulating
SiO layer. A schematic of the vertical structure and a top view of a full chip are depicted
in Fig. 1. Individual lasers have active diameters of 12 µm and the pitch size of the array
is 250 µm.


Fig. 1: Vertical structure and top view of a fabricated VCSEL array.


3.  Output  Characteristics 

All processed arrays show a very good homogeneity of the output characteristics over
the whole array size. Fig. 2 depicts a typical threshold current histogram of an array.
Almost all lasers exhibit threshold currents between 4.1 mA and 4.7 mA with an average
value of 4.4 mA. Each laser has a maximum single-mode output power on the order of
300 µW. Above this level higher order transverse modes appear in the emission. The
maximum output power of about 500 µW is restricted by thermal roll-over due to still
high threshold voltages.

Due to the wedge-shaped central region the emission wavelength of individual lasers
within the array changes slightly with position. Fig. 3 shows emission wavelengths at
driving currents of 5 mA. Emission between 994.3 nm and 998.3 nm makes the array well
suited for wavelength division multiplexing systems. Using different growth conditions
the shift of the emission wavelength can be adjusted to vary up to 30 nm across the array
[5, 6].
Fig. 2: Histogram of threshold currents.

Fig. 3: Emission wavelength distribution.


Fig. 4 illustrates small-signal modulation responses of an individual lasing element
for different driving currents above threshold together with the corresponding light output
characteristics. With increasing driving current the modulation bandwidth increases up
to 6.2 mA, where the next order transverse mode starts to oscillate. A maximum electrical
3 dB modulation bandwidth of 4.6 GHz and a 5 GHz optical bandwidth are obtained
for single-mode emission. Crosstalk to adjacent lasers, both biased above threshold, was
measured to increase with increasing frequency and to remain below -15 dB at 4 GHz.
Crosstalk is mainly caused by non-optimized microwave packaging. By operating all 100
elements in parallel, bit rates of several hundred Gbit/s can be obtained.


Fig.  4:  Light  output  characteristics  and  electrical  small-signal  modulation  response. 
The output characteristics of the arrays can be further improved if laser structures
with lower series resistances are used and if the arrays are solder bonded with a
flip-chip bonding technique on insulating heat sinks. Tests with individual lasers have already
indicated that maximum output powers and modulation bandwidths can be enhanced if
proper heat sinking is applied [7].


4.  Conclusion 

We have fabricated 10x10 independently addressable planar proton-implanted laser diode
arrays with 12 µm active diameters. The output characteristics of the lasers within the
array are very uniform. The elements show an average threshold current of 4.4 mA,
maximum output powers of about 500 µW and optical 3 dB modulation bandwidths of
5 GHz for single-mode emission. These results suggest that VCSEL arrays are ideally
suited for data transmission of several 100 Gbit/s in wavelength division multiplexing
systems or high capacity multi-channel optical interconnects.
Nonuniformity  Tolerance  and  Its  Price  for  Different  Modes  of
Operation  of  FET-SEED  Smart  Pixel  Arrays 

L. M. F. Chirovsky, A. L. Lentine, T. K. Woodward, G. Livescu and G. D. Boyd
AT&T Bell Laboratories, Murray Hill, NJ 07974; Naperville, IL 60566; Holmdel, NJ 07733


Abstract.  We  quantify  the  nonuniformity  tolerance  of  FET-SEED  circuit  arrays  and  show  its 
dependence  on  optical  output  contrast,  switching  speed,  available  laser  power,  and  the  mode  of 
operation.  This  analysis,  utilizing  recent  experimental  data,  then  shows  that  the  preferred  mode  of 
operation  may  be  either  with  differential  optical  signals  or  with  single-ended  optical  signals, 
depending  on  the  circumstances. 


Although a particular kind of FET-SEED Smart Pixel circuit may have acceptable performance for a given
application, one cannot assume immediately that all the Smart Pixels in a large array can exhibit the same
performance in a system. Neither the circuits nor their concomitant optical beams can be identical even if
designed to be so. There will be variations or nonuniformities due to many different causes. Thus the
circuit and system designs should minimize the effects of possible variations, and failing that, be tolerant to
the lingering effects of nonuniformities, so that all pixels operate properly under one global set of
conditions. In recent years, as sources of nonuniformity were identified, iterative redesign has significantly
reduced the size of variations which occur in FET-SEED circuits and the associated system hardware [1,2].
The use of "dynamic latching" greatly dissipates the effects of many of the nonuniformities not eliminated
[3,4]. The remaining variations can be dealt with as if due solely to differences in the amount of optical
power in the beams incident on the modulators in the FET-SEED circuits, hereafter called the "read
beams". The principal issue is to guarantee that the two possible digital signals (logical "1" and "0") are
correctly recognized as such, as data is optically transmitted from pixel to pixel, despite the fact that the
read beam powers, $P_r$, are not all equal. Recall that data is optically transmitted by a FET-SEED circuit
when a read beam impinges on an output modulator that is either in a high reflectivity state, $R_{HI}$, for one
logic state, or a low reflectivity state, $R_{LO}$, for the other logic state. The output signals are then either
$R_{HI}P_r$ or $R_{LO}P_r$. Their ratio, $K_0 = R_{HI}/R_{LO}$, is often called the modulator output contrast.

A mode of FET-SEED circuit operation in which optical data encoding is done with transmitters that
use only one modulator is called single-ended. If R_LO is not zero and K_o is finite (as is usually the case for
SEED modulators), then one can intuitively see that for single-ended operation a nonuniformity problem
can arise when P_ri ≠ P_rj, where i, j denote the ith and jth pixels in an array. For instance, when P_ri >> P_rj,
either R_LO P_ri becomes too large for one logic state or R_HI P_rj becomes too small for the other logic state, as
one tries to globally adjust all the P_r powers together. To date, this problem has usually been dealt with in
systems using FET-SEED circuit arrays by operating in a mode where the data encoding is differential. Each
transmitter has two modulators, configured to settle into complementary states, and receives two read
beams, P_ri1 and P_ri2. The two digital states are thus represented by the complementary signal pairs
(R_HI P_ri1, R_LO P_ri2) and (R_LO P_ri1, R_HI P_ri2). The optical receivers, which already use dynamic latching
for its own benefits, are then designed to respond to the difference in power in a signal pair, so that the two
logic states are registered by (R_HI P_ri1 - R_LO P_ri2) or (R_LO P_ri1 - R_HI P_ri2). If there is local uniformity
(where P_ri1 = P_ri2 = P_ri, P_rj1 = P_rj2 = P_rj) and only global nonuniformity (P_ri ≠ P_rj), then in differential
operation the digital signals are P_ri(ΔR) or P_ri(-ΔR); P_rj(ΔR) or P_rj(-ΔR); etc., where ΔR = R_HI - R_LO.
Then, once all P_r are made large enough, all digital signals will be correctly recognized even if P_ri >> P_rj
or vice versa. Thus differential operation is generally considered much more nonuniformity tolerant than
single-ended operation and essentially dependent on ΔR rather than K_o.
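
The contrast between the two encodings can be illustrated numerically. The following Python sketch is only an illustration: the reflectivities, beam powers and function names are assumed values chosen for the example, not data from the paper.

```python
# Illustrative sketch: single-ended vs. differential signalling under
# global nonuniformity. All numeric values are assumed, not measured.

R_HI = 0.4   # assumed high-state reflectivity
R_LO = 0.1   # assumed low-state reflectivity, so K_o = R_HI / R_LO = 4


def single_ended_signals(p_r):
    """Return the (logic-1, logic-0) output powers for one read beam."""
    return R_HI * p_r, R_LO * p_r


def differential_signals(p_r1, p_r2):
    """Return the signed power differences for the two logic states."""
    return R_HI * p_r1 - R_LO * p_r2, R_LO * p_r1 - R_HI * p_r2


# Two pixels whose read-beam powers differ by a factor of five.
p_ri, p_rj = 5.0, 1.0

one_j, _ = single_ended_signals(p_rj)     # weakest "1" in the array
_, zero_i = single_ended_signals(p_ri)    # strongest "0" in the array
# With one global threshold, the strongest "0" is at least as large as
# the weakest "1", so the two states cannot be separated.
single_ended_ambiguous = zero_i >= one_j

# With local uniformity, the sign of the differential signal depends only
# on the logic state, whatever the absolute beam power.
d1, d0 = differential_signals(p_rj, p_rj)
differential_ok = d1 > 0 > d0
```

The differential case here assumes perfect local uniformity, which is exactly the assumption the following paragraphs call into question.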

Some of the assumptions underlying the above assertion do not hold steadfastly under close scrutiny.
The two principal ones are that there is local uniformity (P_ri1 = P_ri2, etc.) and that the P_r's can be made as
large as necessary. Recent measurements of beam powers in a real system [2,5] showed that, effectively,
the local and global variations, as defined above, are about the same. Saturation and heating effects in the
modulators, as well as limited laser source power for read beams, conspire to set an upper limit on the P_r's.
The consequences of these facts can be quantified with the following analysis.

Figure 1. Plots of the limits on the launched laser energy, E_l, in units of G* E_sw, for differential (lower limit, E_ld) and
for single-ended (lower limit, E_ls1; upper limit, E_ls2) modes of operation of FET-SEED Smart Pixel Arrays, versus
signal contrast, K_o, in units of M, the nonuniformity tolerance margin. Superimposed in the figure is a plot of E_lm,
the maximum allowed launched laser energy for the particular system application case given in the figure. The position
of E_lm can change with every application. In the region below E_lm and above E_ld or E_ls1, variations in the absolute
value of E_l are tolerated above and beyond the nonuniformity variations across an array characterized by M.

To deliver, through a transmitter, the switching energy, E_sw, of a FET-SEED receiver, a
source laser must launch an energy, E_l. These two energies have the relationship given by the equation:

E_l = G E_sw,    (1)

where G^-1 is the efficiency with which launched energy is converted to switching energy. (G is reducible
to the “Figure of Merit” (FM) discussed in [6].) For differential operation G_d is given by:

G_d = [a_1 a_2 (R_HI P_l1 - R_LO P_l2) / P_lb]^-1,    (2)

where P_lb is the average laser launched power per beam, a_1 is the attenuation factor, due to system optics,
between the laser and a transmitter’s modulator, and a_2 is the attenuation factor, due to system optics,
between transmitters and receivers. (Note that by these definitions P_r1 = a_1 P_l1, etc., and P_lb = P_l/N, where P_l
is the average laser power and N is the number of pixels.) The required laser launched energy for differential
operation, E_ld, is then given by:

E_ld = 2 G_d E_sw.    (3)

Now let the beam powers vary such that over N pixels,

(1 - x) P_lb ≤ P_lia ≤ (1 + x) P_lb,    (4)

where x is a fraction < 1, i = (1, 2, ... N) and a = 1, 2.

Also define the nonuniformity tolerance margin, M, by

M = (1 + x)/(1 - x).    (5)

(Note that M is the ratio of the largest over the smallest beam power.) Then, for the worst case, which
governs the array performance,

E_ld ≥ 2 G* (1 - M/K_o)^-1 E_sw,    (6)

where

G* = [a_1 a_2 (1 - x) R_HI]^-1.    (7)
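
The behaviour of the margin M of eq. (5) and of the differential lower limit of eq. (6) can be explored numerically. The sketch below is a hedged illustration: the function names and sample values are assumptions, and energies are expressed in units of G* E_sw so that no device parameters are needed.

```python
# Sketch of eqs. (5) and (6) for differential operation, with energies
# expressed in units of G* E_sw. Names and sample values are illustrative.

def tolerance_margin(x):
    """Eq. (5): M = (1 + x) / (1 - x) for a fractional power variation x."""
    if not 0 <= x < 1:
        raise ValueError("x must satisfy 0 <= x < 1")
    return (1 + x) / (1 - x)


def differential_lower_limit(k_o, m):
    """Eq. (6): minimum launched energy E_ld in units of G* E_sw.

    Returns infinity when K_o <= M, where no finite energy is sufficient.
    """
    if k_o <= m:
        return float("inf")
    return 2.0 / (1.0 - m / k_o)


m = tolerance_margin(0.5)                         # beams vary by +/-50 %, M = 3
limit_at_m = differential_lower_limit(m, m)       # K_o = M: unbounded
limit_at_2m = differential_lower_limit(2 * m, m)  # K_o = 2M: 4 units
limit_at_3m = differential_lower_limit(3 * m, m)  # K_o = 3M: about 3 units
```

The limit only approaches its floor of 2 units when K_o is many times M, which is the sense in which contrast still matters for differential operation.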

Thus even for differential operation, the contrast ratio, K_o, is important, as was pointed out before in [7].
Error-free performance is not possible unless K_o > M, and then E_ld remains large until K_o >> M (or at
least K_o > 3M). In a system application, there will be an upper available limit on P_lb and a lowest
acceptable limit on f, the data rate per pixel, setting an upper limit on the amount of laser energy launched per
pixel, E_lm, given by:

E_lm = P_l/(N f).    (8)

Figure 2. a) This output voltage vs. input voltage plot for a MESFET inverter circuit illustrates a sharp output voltage
switching characteristic. The required input voltage swing contrast for proper digital output with this inverter is 1.8. b)
This is a plot of the output voltage versus the input current for an experimental receiver circuit specifically designed
for single-ended operation. The cursor and marker identify the threshold current, I_th (essentially no switching for
I < I_th), and the switching current, I_sw (essentially complete switching for I > I_sw). Therefore, for proper digital
switching, here K_ri = 1.25.

Without a sufficiently large K_o, the allowable x and M become small and the nonuniformity tolerance of
differential operation evaporates because one must have E_ld ≤ E_lm (see Figure 1). A similar analysis shows
that the methods of single-ended operation employed in many optical logic devices, using bias beams or
reference beams, fare even worse. Is there an alternative?

The  answer  is  possibly  yes,  in  the  form  of  single-ended  operation  with  specially  designed  optoelectronic 
receiver  circuits.  These  receivers  must  exhibit  three  important  characteristics:  1)  a  sharp  output  switching 
threshold; 2) a quiescent state, for when no signal is applied, at a well defined distance from the threshold;
and  3)  a  mechanism  for  returning  the  receiver  to  its  quiescent  state  in  the  absence  of  light,  which  is  based 
on  a  feedback  principle  or  operates  asynchronously  with  the  signal. 

Figure 2a depicts the input-output voltage characteristics of an electronic inverter, illustrating at least
one way of obtaining a sharp output switching threshold. If, for this inverter, one arbitrarily chooses V_in at
+0.7 V as a starting point, then any logical “0” inputs can swing V_in as far as +0.2 V (at the cursor) and produce
essentially no output swing. Logical “1” inputs need to swing V_in only beyond -0.2 V (at the marker) to
produce essentially complete output swing. Making the assumption that the input voltage swing, in a
receiver, can be made proportional to optical input energy, one can then define the concept of required input
contrast ratio, K_ri, the minimum necessary input swing for switching divided by the maximum allowed
input swing for no switching. The inverter of Fig. 2a has a K_ri of 1.8. Figure 2b shows the input-output
response of the electronic portion of a receiver circuit designed specifically for single-ended operation. The
K_ri here, defined as input switching current divided by the input threshold current, turns out to be 1.25.
Figure 3 shows the results of an experiment where a FET-SEED receiver-transmitter, designed for
differential operation, was actually operated by a novel single-ended mode of operation called
Asynchronous Reset On Every Bit Input Contentionless Switching (AROEBICS), described briefly in the
figure caption and in [8]. For the receiver in Fig. 3 and for complete single-ended receivers in general, there
is a minimum necessary switching energy, E_sw, for bit 1 inputs and a maximum allowed energy, E_th, for bit
0 inputs (which produces essentially no switching).
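
The required input contrast ratio quoted for the inverter of Fig. 2a follows directly from the three voltages given above. A minimal sketch, in which the helper name is hypothetical and the only inputs are the values stated in the text:

```python
# Sketch: required input contrast ratio from quiescent, no-switching and
# full-switching input levels. The helper name is hypothetical.

def required_input_contrast(quiescent, no_switch_limit, full_switch_level):
    """Minimum swing that must switch, divided by maximum swing that must not."""
    max_swing_no_switching = abs(quiescent - no_switch_limit)
    min_swing_switching = abs(quiescent - full_switch_level)
    return min_swing_switching / max_swing_no_switching


# Fig. 2a inverter: start at +0.7 V, cursor at +0.2 V, marker at -0.2 V,
# i.e. a 0.9 V swing against a 0.5 V swing.
k_ri_inverter = required_input_contrast(0.7, 0.2, -0.2)
```

The result is approximately 1.8, the value stated for this inverter.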

Such receivers then have a required input contrast ratio, K_ri, obviously > 1, such that

K_ri = E_sw/E_th.    (9)

Single-ended modes of operation have two constraints on E_l, namely E_ls1 and E_ls2, such that:

E_ls1 ≥ G_s1 E_sw and E_ls2 ≤ G_s2 E_th.    (10)

For the worst cases, which govern array performance,

G_s1 = G* and G_s2 = G* K_o/M.    (11)

Error-free performance is not possible unless (from eqs. 9, 10, 11)

K_o ≥ M K_ri    (12)

and E_ls1 < E_lm. Plots of E_ls1 and E_ls2 are shown in Figure 1 for comparison with E_ld. Note that E_ld exceeds
E_ls1 by a factor of 2/(1 - M/K_o). For receivers with K_ri = 2 and with a not very large K_o of 4, a very
respectable M = 2 is allowed and E_ld = 4 E_ls1. So, such a single-ended operation could be four times faster
than the differential mode or alternatively could be done with four times less laser power. From a different
perspective, one can also deduce that when a system application demands a high data rate with limited
laser power, the single-ended mode of operation can have greater nonuniformity tolerance than the
differential mode of operation. On the other hand, with a very relaxed data rate requirement, the differential
mode will work with a lower K_o, by a factor of K_ri, than the single-ended mode would.

Figure 3. a) This is a schematic diagram of a FET-SEED optical signal receiver-transmitter, designed for differential operation, but
which can also be operated by the single-ended AROEBICS procedure. The input voltage value of the quiescent state is determined
by the voltage on a forward-biased clamping diode. b) These are the results of an AROEBICS experiment performed with the
circuit at a 100 Mbit/s data rate. The output is the reflection of a CW beam on the modulator “Out 2” and has a K_o ~ 4. “Return to
Zero” (RZ) digital signal pulses arrive on the “Top” detector diode. Interleaved in time between the signal pulses, regularly clocked
optical RZ pulses arrive on the bottom diode, to reset the receivers to the quiescent state, before each signal pulse. The K_ri is shown
to be 1.68, so the allowed M can be as large as 2.38.
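
The numerical examples quoted for single-ended operation can be reproduced with a few lines of Python. This is a sketch under stated assumptions: it takes the worst cases to be the weakest beam reflecting from the high-reflectivity state and the strongest beam reflecting from the low-reflectivity state, expresses energies in units of G* E_sw, and uses illustrative function names.

```python
# Sketch of the single-ended limits, in units of G* E_sw, assuming the
# worst-case beams described in the lead-in. Names are illustrative.

def single_ended_limits(k_o, k_ri, m):
    """Return (lower, upper) limits on launched energy in units of G* E_sw."""
    lower = 1.0                   # weakest beam must still deliver E_sw
    upper = k_o / (m * k_ri)      # strongest "0" must stay below E_th
    return lower, upper


def max_margin_single_ended(k_o, k_ri):
    """Largest M for which the lower limit does not exceed the upper limit."""
    return k_o / k_ri


def differential_to_single_ended_ratio(k_o, m):
    """E_ld / E_ls1 = 2 / (1 - M / K_o); only meaningful for K_o > M."""
    return 2.0 / (1.0 - m / k_o)


m_example = max_margin_single_ended(4.0, 2.0)                  # 2.0
ratio_example = differential_to_single_ended_ratio(4.0, 2.0)   # 4.0
m_fig3 = max_margin_single_ended(4.0, 1.68)                    # about 2.38
```

At the largest allowed M the two single-ended limits coincide, so in practice some of the margin would have to be given up to leave room for variations in the absolute laser energy.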

In  summary,  a  simple  method  of  analysis,  which  quantifies  nonuniformity  tolerance  in  FET-SEED 
circuit  arrays  by  relating  it  to  easily  measurable  parameters,  leads  to  an  understanding  of  how  signal 
contrast,  switching  speed,  available  laser  power,  mode  of  operation  and  nonuniformity  tolerance  impact 
each other so that reasoned design choices can be made when such arrays are incorporated into systems.

Optical, algorithmic and electronic considerations on the
desirable “smartness” of optical processing pixels


M.P.Y.  Desmulliez,  B.S.  Wherrett,  J.F.  Snowdon,  J.A.B.  Dines 

Department of Physics, Heriot-Watt University, Riccarton EH14 4AS, Scotland,
U.K. 


Abstract.  The  performance  gains  associated  with  optical  computing  schemes  are  analysed  in 
terms  of  the  complexity  of  individual  processing  nodes  in  order  to  optimise  the  efficiency  of 
an optical processor. The sorting task is chosen as a practical example. Performance metrics
are  discussed  for  various  component  technologies  and  electronic  node  (smart-pixel)  sizes. 


1.  Introduction 

Advances  in  the  integration  of  optoelectronic  components  with  VLSI  chips  are  now  enabling 
arrays  of  electronic  nodes  to  be  interconnected  optically  through  free  space  [1].  This  hybrid 
technology  offers  performance  gains  in  many  computing  tasks.  Little  has  been  done  to 
quantify  these  gains  or  to  compare  the  various  options  in  specific  component  technologies  or 
in  electronic  node  size  [2-4].  The  “desirable  smartness  of  a  pixel”  refers  to  the  complexity  of 
the  individual  nodes.  Complex  (very  smart)  pixels  demand  large  chip  area;  hence  only  a  few 
can be accommodated per chip and the 2-D parallelism offered by the optical interconnect may
not be optimally exploited. Conversely, the optics may not be able to cope with vast numbers
of simple pixels per chip, or the resulting high number of optically interconnected chips
required to accomplish a given processing task may lead to inefficiencies.

We choose to consider the sorting task because, on algorithmic grounds, there is a
reduction in computational time if it is implemented with space-variant non-local optical
interconnects [5]. Sorting also remains one of the main tasks in computation [6]. Moreover, pixels
of different complexities have already been designed and laid out to perform the sorting.

Three different classes of pixels have been studied. (i) The Symmetrical Self-Electrooptic-Effect
Device (S-SEED) provides the least intelligent of the pixels in the sense
that the logic functions of these p-i-n photodiodes are just NAND or NOR operations [7].
(ii) Any logic operation is a priori possible with the logic SEEDs (L-SEED) [8]. (iii) The last
type of pixel integrates photoreceivers and modulators with VLSI-compatible electronic
circuitry. The integration is either monolithic (FET-SEED) [9] or performed by flip-chip
bonding the optical transceiver chip onto a CMOS electronic chip [10]. The voltage gain of
the first electronic stage provides both a substantial reduction in optical input switching energy
and a thresholding unit. The pixel is designed to operate as a two-input, two-output
exchange/bypass node [11]. Each of these technologies is considered to be operated at its
optimum conditions.

In the optical domain, the limitations on the scaling possibilities originate from
fundamental properties and technical constraints. For example, the aberration and diffraction
effects of the imaging devices and the space-variant nature of the optical interconnections
impose an upper limit on the space-bandwidth product available and the field of view.
Technical limitations encompass the micro-mechanical alignment and the stability of the
optical system. The amount of source power per laser (taken to be 1 watt) restricts the power
available per pixel, thereby limiting the switching rate of the nodes. In the electronic domain,
the decrease of the yield with increasing chip size demands an area of no more than 1 cm^2,
compatible with the optical field of view. In the thermal domain, it is assumed that 10 W cm^-2
of heat can be dissipated at each pixel array; this is achievable with conventional forced air
cooling methods.

2. Effects of limited amount of laser source power, power dissipation and pixel complexity

The optimum smart pixel maximises the processing throughput rate while satisfying the
constraints imposed by the limited heat removal capability and limited laser source power.
The throughput rate of the sorter is defined as the number of 8-bit words that can be sorted
per second and is expressed in millions of (byte) operations per second (Mops/s). On the sole
assumption of limited laser power, the optimum throughput rate is a trade-off between the
increase in number of processing elements (with increasing chip size) and the decrease in
power available at each node (figure 1). The optimum throughput rate is reached for 36 x 36
S-SEED or L-SEED-based arrays. The dimensions of the (existing) devices are, respectively,
10 µm x 20 µm and 55 µm x 55 µm. The optical windows are, respectively, 5 µm x 10 µm
and 5 µm x 5 µm with a capacitance of 120 aF µm^-2. The contrast ratio of the output
modulators is 4:1 if the device is operated at maximum absorption when unbiased and 6:1
otherwise. The optimum throughput rate of the 2x2 FET-SEED node of dimensions
560 µm x 280 µm is 220 Mops/s and is obtained for an array size of 0.85 cm^2 whereas the
optimum array size for 2x1 nodes is about 0.40 cm^2. The throughput rate for the 1 µm
CMOS-SEED is limited by the maximum chip size and not by the power available. The
CMOS pixel, of dimension 400 µm x 200 µm, is assumed to operate at a frequency of
100 MHz whereas the FET-SEED node has been successfully simulated at 310 MHz. At
higher frequencies, more power is needed to convert the optical signals into electronic voltage
swings. This explains the decrease of the throughput rate at large chip sizes for the
FET-SEED-based array.

The power dissipated by the S-SEEDs and L-SEEDs originates from the photocurrent
generated by the optical beams. The power-switching-time product of these devices [7]
suggests that high frequencies can be achieved if high optical power is available. The
saturation of the absorption peak [12] at high power levels, however, limits the maximum
frequency of operation if cascaded arrays of such devices are utilised. This is shown in figure
2 for the case of two S-SEED arrays. For this figure and the following ones, the power
dissipated (left vertical axis) and the required laser source power (right vertical axis) per pixel
are displayed as functions of the operating frequency of the devices. On each of the axes, a dash
also indicates the maximum powers allowed per device for different array sizes. The
saturation irradiance of the SEED is assumed to be 1 kW cm^-2. A maximum frequency of
70 MHz is then achieved for 500 µW of laser power at each node, which corresponds to
350 µW of power dissipated per device. The limited laser source power is the fundamental
limit for the S-SEED; 11 Mops/s is achievable for a 128 x 128 array (A) whereas 12 Mops/s
(B) is possible for a 64 x 64 array.

The total power dissipated for the CMOS and FET-SEED (P_tot in figures 3 and 4) comes from
the heat generated by the output SEED modulators (P_mod in figure 3) and by the electronic
circuitry (P_elec in figure 3). The total input capacitance of each pixel is taken to be 60 fF.
The frequency depends on the energy used to convert the optical signals into voltage swings.
Pulsed operation of the lasers is hence preferred for high frequency in order to provide
the minimum input energy in a short timescale [13]. For a 32 x 32 array of CMOS nodes
(figure 3), the limited amount of laser source power restricts the throughput rate to 88 Mops/s
(C) for a frequency of 80 MHz. The effect of the pixel complexity within the same MESFET
logic family can be seen in figure 4. The power dissipated is essentially of electronic nature
because of the buffered FET logic (BFL) family. The more complex node dissipates around
600 mW per node (solid line) whereas the simpler node dissipates only 53 mW (dashed line).
This time, the heat removal capability dictates the maximum number of nodes possible on the
chip. In the first case, a 4 x 4 array at 320 MHz produces a throughput rate of 38 Mops/s (D)
whereas 100 Mops/s are possible for a 16 x 16 array at 230 MHz (E).

3.  Conclusion 

Optimum processing throughput rates of an optically interconnected bitonic sorter have been
calculated for different pixel complexities and electronic node sizes. It has been shown that, in
systems for which the electronic power dissipation of the pixels is negligible, the power of the
laser source limits the throughput rate. High power, highly modulated laser sources are
desirable in order to increase the throughput rate. The FET-SEED pixels, which dissipate a
large amount of power, must be kept simple in order to approach the performance of the
CMOS-based node array. If Directly Connected FET Logic (DCFL) FET-SEED nodes are to
appear, they will provide an improvement on CMOS-SEED devices for the same node
complexity.

Abstract. The design and implementation of an optically interconnected
self-routing exchange/bypass node array is described. The nodes are laid out
using silicon and gallium arsenide based technologies. 900 Mbs/s of throughput
rate is achievable for the sorting of 1024 8-bit words at an operating
frequency of 100 MHz.


1.  Introduction. 

Sorting  remains  one  of  the  most  commonly  performed  tasks  in  any  computation  [1].  It 
is  also  a  task  for  which  the  use  of  a  space-variant  non-local  interconnect,  such  as  the 
perfect  shuffle,  provides  an  increase  in  performance  compared  to  a  local  interconnect 
implementation  [2].  This  increase  arises  from  the  reduction  in  the  number  of  clock  cycles 
needed  to  route  the  data  to  their  respective  addresses.  When  used  in  a  system  performing 
sorting,  the  perfect  shuffle  has  the  important  property  of  being  stage  invariant.  Only 
one  set  of  hardware  is  thus  required  to  implement  the  stages  of  a  sorting  network  if  a 
recirculating  design  is  used.  Recent  advances  in  hybrid  opto-electronics  now  allow  the 
construction  of  optical  demonstrators.  This  is  most  advantageous  since  it  was  shown 
that  the  throughput  rate  can  be  enhanced  by  two  orders  of  magnitude  if  dedicated 
hardware  in  the  form  of  smart  pixels  is  provided  to  reduce  the  computational  time  of 
the sorting [2]. Self-routing exchange/bypass nodes have hence been designed and laid
out in semi-custom 1 µm n-well double-metal CMOS technology and in AT&T FET-SEED
monolithic integration [3]. Arrays of such processing units are to be included in
a perfect-shuffle interconnected bitonic sorter [4].

2. The bitonic sorting algorithm.

The network described here is based on Batcher’s bitonic sorting algorithm [5]. It uses
arrays of opto-electronic switching nodes, each of which compares, processes and outputs
two optical digital streams of bits to the next array. The interconnection pattern between
two successive processing steps can be implemented by the optical perfect shuffle (fig.
1). Each node has three functions: compare the two inputs, then either interchange them
or let the data pass straight through [6]. The network is self-routing in the sense that
the control of the path is built into the logic. In our iterative system (fig. 2) a sequence
of pre-determined space-variant control images provides the optical signals that specify
the functionality of each node at each cycle. If the control bit is set to zero, the output
D (fig. 1) is the maximum of the two inputs (A, B). If the control is set to the logic level
1, the maximum of the inputs is output to C. A second control bit prevents the node
from comparing the data; the outputs (C, D) are then the images of (A, B), respectively.
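
The node behaviour described above can be sketched in software, together with a bitonic sorting network built from it. This is only an illustration: the function names are hypothetical, and the index-based wiring used here is the textbook form of Batcher's network standing in for the optical perfect shuffle, not the recirculating hardware of the paper.

```python
# Sketch of an exchange/bypass node and a bitonic sorting network built
# from it. Index-based wiring stands in for the optical perfect shuffle.

def exchange_bypass(a, b, control, bypass=False):
    """Return outputs (C, D) of one node for inputs (A, B).

    control = 0 sends the maximum to D, control = 1 sends it to C;
    bypass = True passes (A, B) straight through to (C, D).
    """
    if bypass:
        return a, b
    lo, hi = min(a, b), max(a, b)
    return (hi, lo) if control else (lo, hi)


def bitonic_sort(values):
    """Sort a list whose length is a power of two into ascending order."""
    data = list(values)
    n = len(data)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # distance between the two compared items
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    # Blocks with bit k set are sorted in descending order.
                    control = 1 if (i & k) else 0
                    data[i], data[partner] = exchange_bypass(
                        data[i], data[partner], control)
            j //= 2
        k *= 2
    return data


sorted_words = bitonic_sort([7, 3, 0, 5, 6, 2, 1, 4])
```

The per-node control value depends only on the node position and the stage, which is why a pre-determined sequence of control images is sufficient to drive the network.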


3. Logic and layout design of the switching node.

AT&T have developed a technology that integrates S-SEEDs and FET logic circuitry
onto the same chip [3]. A 4x2 array of 2x2 switching nodes has been fabricated. The
high value of the scale parameter, the availability of only depletion-mode FETs and the
complexity of the design have produced a node of area 560 µm by 280 µm. The power
density of the chip, estimated at 34 W cm^-2, necessitates forced cooling capabilities.
Reduced node (smart pixel) area and reduced power demands are achieved using SEED
technology for input optical detection/output modulation, combined with CMOS electronic
circuitry. Solder bump flip-chip bonding will be used to interface the two technologies
[7]. Figure 3 shows the silicon test node which is implemented in European
Silicon Structures (ES2) 1.0 µm CMOS n-well double metal technology via Eurochip.
The node, of area 370 µm by 230 µm, encompasses two 8-bit long shift registers which
store the two words at each processing step. A power density of 10 W cm^-2 at 100 MHz
has been estimated which accounts for the optical losses of the system, the electrical
power consumption, the power absorbed by the receivers and the modulation efficiency
of the modulators.

The functionality of the node is described in figure 4. Two bit-serial words, A and B,
are clocked through the node, most significant bit first. The first bit difference between
the inputs is detected by the comparator and memorized by the following latch. A reset
signal sets this latch to a neutral state before the inputs are applied. A control
signal, P, is used to establish whether the minimum or the maximum of the inputs is
to be output at C (and the max/min at D). Alternatively a bypass signal is introduced
in order to override P and prevent data exchange. The output latch stage provides
the state of the switch, which controls the multiplexer. The multiplexer then routes the
input signals to the outputs C and D. The electrical feedback between the output stage
and the comparator ensures that, having latched as a consequence of a bit-difference,
the state of the latch is not altered by subsequent opposing bit-differences, until the
end of the words. The remaining word bits are then clocked through the node until the
application of the reset, P and/or bypass signals for the processing of the next words.
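
The bit-serial compare-and-latch behaviour can be modelled in a few lines. The sketch below is a behavioural illustration only: the class and method names are hypothetical, and it assumes that P = 0 routes the minimum to C and P = 1 routes the maximum to C, consistent with the control convention of Section 2.

```python
# Behavioural sketch of the bit-serial compare-and-latch node. Class and
# signal names are hypothetical; the real node is a hardware circuit.

class BitSerialNode:
    def __init__(self):
        self.reset()

    def reset(self, p=0, bypass=False):
        """Put the latch in its neutral state and load the control signals."""
        self.decided = False      # latch has not yet seen a bit difference
        self.swap = False         # multiplexer state once latched
        self.p = p                # assumed: 0 = minimum to C, 1 = maximum to C
        self.bypass = bypass

    def clock(self, a_bit, b_bit):
        """Process one bit of A and B (MSB first); return the (C, D) bits."""
        if not self.bypass and not self.decided and a_bit != b_bit:
            self.decided = True               # first difference is latched
            a_is_larger = a_bit > b_bit
            # Swap when the word that belongs on C is arriving on B.
            self.swap = a_is_larger if self.p == 0 else not a_is_larger
        return (b_bit, a_bit) if self.swap else (a_bit, b_bit)


def route_words(a, b, p=0, bypass=False, width=8):
    """Clock two unsigned words through a node and return the words (C, D)."""
    node = BitSerialNode()
    node.reset(p, bypass)
    c = d = 0
    for shift in range(width - 1, -1, -1):    # most significant bit first
        c_bit, d_bit = node.clock((a >> shift) & 1, (b >> shift) & 1)
        c, d = (c << 1) | c_bit, (d << 1) | d_bit
    return c, d
```

Because the bits preceding the first difference are identical on both inputs, it does not matter that they leave the node before the routing decision is latched; only the first differing bit and those after it need to be steered.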
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4.  Data  bottleneck  and  laser  power  requirement. 

The  main  components  of  the  sorter  are  the  input  spatial  light  modulator  (SLM),  the 
memory  array,  the  sorting  node  array  and  the  perfect  shuffle  implementation  (figure  2). 
The  SLM  is  taken  to  be  an  electrically  addressed  detector/modulator  array,  based  on 
SEED  technology,  and  flip-chip  bonded  onto  the  CMOS  electronic  circuitry  [7]. 

The CMOS circuitry is capable of a 100 MHz operating frequency; the sorting of 32 by
32 words of 8 bits therefore requires 9 µs. The set of words is to be presented as a 32 by
32 array for each of the eight bit-planes. The SLM must then output eight images, at 10
ns intervals, once every 9 µs. Each SLM pixel must be provided with at least 8 bits of
memory in order to achieve this. 32 lines of electronic inputs, each working at 28 MHz,
are sufficient to provide the refresh rate demanded. This is within the capabilities of
electronics and thus there is no data bottleneck at the input SLM. To ensure a 100 MHz
operating frequency at the sorting node, the incident power levels must exceed around
80 µW per beam. Taking into account the optical losses within the system, the SLM
read lasers then need to have a power greater than 1.4 W. The low duty factor and use
of CMOS drive circuitry mean that the SLM would not require special heatsinking. A
potential data bottleneck does occur, however, at the control SLM since this modulator
array needs to be refreshed every clock cycle; solutions have been found to eliminate this
bottleneck.
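
The input-SLM figures quoted above can be checked with simple arithmetic. The following sketch uses only the numbers stated in the text; the variable names are illustrative.

```python
# Back-of-envelope check of the input-SLM figures quoted in the text.

WORDS = 32 * 32          # words sorted per batch
BITS_PER_WORD = 8
SORT_TIME_S = 9e-6       # sorting time quoted in the text
INPUT_LINES = 32         # electronic input lines feeding the SLM

total_bits = WORDS * BITS_PER_WORD            # 8192 bits per batch
data_rate_bps = total_bits / SORT_TIME_S      # roughly 9.1e8 bit/s
line_rate_hz = data_rate_bps / INPUT_LINES    # roughly 28 MHz per line
```

Both results agree with the quoted throughput of about 900 Mbit/s and the 28 MHz rate per input line.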


5.  Conclusion. 

Exchange/bypass switching nodes have been designed, fabricated and tested. They provide
the first elements of a perfect shuffle interconnected bitonic sorter. The next generation
of these chips will be flip-chip bonded onto S-SEED modulators and receivers.
The bit output data rate of 32x32 words of 8 bits to be sorted at 100 MHz operating
frequency is 900 Mbs/s. This corresponds to 10^^ gate-Hertz or 10^^ pin-Hertz.

Abstract. The architecture and complexity analysis of a 2 dimensional array of smart pixels
which can be user programmed to implement arbitrary functions, including binary sorting or
switching nodes or HyperPlane switching nodes, is described. A large array where each
pixel can be programmed independently may require a hundred thousand bits of control
memory. A programmable architecture which exploits the symmetry inherent in smart pixel
arrays and reduces the control memory requirements by a few orders of magnitude is
proposed. A complexity analysis indicates that mask-programmable arrays with the
equivalent of 112 x 112 pixels and field programmable arrays with the equivalent of 64 x 64
pixels are achievable on a 1 cm^2 MOS die. The use of programmable arrays can automate the
VLSI mask design process, can allow for cost reduction through larger production volumes,
and can allow for standardization in pixel pitch, optical and opto-mechanical packaging
assemblies, electronic packaging assemblies and other supporting technologies.

1.  Programmable  Smart  Pixel  Array  Architectures. 

The  architecture  of  a  2  dimensional  array  of  smart  pixels  which  can  be  user  programmed  to 
implement  arbitrary  functions,  including  binary  sorting  or  switching  nodes  or  HyperPlane 
switching nodes, is described. Each programmable smart pixel consists of optical I/O,
electrical I/O with neighboring smart pixels and/or bonding pads, flip-flops for internal state
information and some type of programmable logic device to implement a finite state
machine. The programmable logic can be implemented in a number of ways, leading to classes
of programmable smart pixel arrays. A "Mask Programmable Smart Pixel Array" is one
where each pixel includes some type of VLSI mask-programmable device, such as a
"Programmable Logic Array" (PLA). Mask-programmable PLAs are programmed at
fabrication time through the addition of transistors in the VLSI masks. A "Field
Programmable Smart Pixel Array" is one where each pixel can be user programmed in the
field to implement arbitrary logic functions. The function of each pixel can be programmed
by down-loading a bit-stream into an on-chip control memory, the technique used in the
smart pixel arrays for the HyperPlane photonic backplane [5]. The dynamic programmability
can also be achieved using RAM or ROM based lookup tables, or by adapting a
mask-programmable PLA to become field programmable (the approach taken in this paper).

2.  Reducing  Memory  Requirements  in  Field  Programmable  Smart  Pixel  Arrays. 

A basic cell for a "Programmable Smart Pixel Array" is shown in fig. 1a. Each cell has four
sides, N, S, E, W, and each side contains one or more electronic 1-bit wide I/O ports, i.e.,
In.N.1 denotes the input port on the North side with index 1. One internal structure of the cell
is shown in fig. 1b. The internal finite state machine may consist of optical and electrical I/O,
a combinational PLA, followed by a set of D-Flip Flops with feedback. Each cell also has
one or more global clock and reset inputs which are not shown in the figure. These cells can
be stacked into 2 dimensional arrays. The ratio of electrical to optical I/O bandwidth on the
integrated circuit can be adjusted by varying the placement and number of electrical
connections to I/O bonding pads or "pin-outs". ICs with a small number of electrical pin-outs
may provide connections to pins only to cells at the perimeter of the smart pixel array
through the N, S, E, W I/O ports; ICs with many electrical pin-outs may also provide every cell
with connections to pins, including cells internal to the 2D array. In a mask-programmable
PLA the logic functions to be implemented by the cells are known at fabrication time and are
implemented by customizing a predefined VLSI mask by inserting transistors between
appropriate rows and columns in a PLA. Every cell can contain a distinct logic function since
the PLA in every cell can be customized without incurring any VLSI area overhead. In mask-
programmable devices the control memory shown in fig. 1b for programming the PLA is not
necessary. In a field-programmable array in which cells are individually programmable,
every cell will require a control memory to store the logic function for its PLA as shown in
fig. 1b.

Figure 1. (a) VLSI I/O of a basic cell, (b) Internal structure of a basic cell.

A  generic  model  of  a  mask-programmable  PLA  is  shown  in  figure  2.  In  the  And-array 
every  horizontal  line  represents  one  logic  "product"  term,  i.e.,  the  logical  AND  of  all  the 
inputs  to  which  it  is  connected  through  a  closed  connection.  In  the  Or-array  every  vertical 
line represents a "sum" term, i.e., the logical OR of all product terms to which it is connected
through  a  closed  connection.  A  search  of  the  literature  did  not  reveal  any  references  to 
architectures  for  field  programmable  PLAs  and  hence  a  generic  model  is  proposed.  In  a  field 
programmable PLA a "programmable connection" is inserted at every intersection. Each
programmable connection has 2 states, "opened" and "closed", and is controlled by a bit of
control memory stored on or off the die. Closed connections are indicated by the bold boxes
in fig. 2. Field programmable PLAs can be implemented in various technologies. In the GaAs
FET-SEED technology one implementation of a programmable connection is a dual-gate
FET transistor, implementing the logical AND function. Other implementations may choose to
control the terminal voltages of a MOSFET to switch it between opened and closed states or
may operate two transistors in series.
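
The generic model above can be illustrated with a minimal Python sketch of a PLA whose intersections are each controlled by one bit. The function name `eval_pla`, the bit ordering (a true and a complemented literal per input) and the XOR program are illustrative assumptions and do not come from the paper.

```python
from typing import List, Sequence


def eval_pla(inputs: Sequence[int],
             and_bits: Sequence[Sequence[int]],
             or_bits: Sequence[Sequence[int]]) -> List[bool]:
    """Evaluate a PLA whose intersections are closed (1) or opened (0).

    and_bits[p] holds 2*I control bits for product term p, ordered
    (x0, not x0, x1, not x1, ...). or_bits[p] holds O control bits
    connecting product term p to each output.
    """
    # Assumption: every input is available in true and complemented form.
    literals: List[bool] = []
    for x in inputs:
        literals.extend([bool(x), not x])

    # And-array: a product term is the AND of every literal it is connected to.
    products = [
        all(lit for lit, closed in zip(literals, row) if closed)
        for row in and_bits
    ]

    # Or-array: an output is the OR of every product term it is connected to.
    n_outputs = len(or_bits[0])
    return [
        any(prod and row[o] for prod, row in zip(products, or_bits))
        for o in range(n_outputs)
    ]


# Illustrative program: XOR of two inputs using two product terms.
XOR_AND = [[1, 0, 0, 1],   # x0 AND (not x1)
           [0, 1, 1, 0]]   # (not x0) AND x1
XOR_OR = [[1], [1]]        # the single output sums both product terms

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, eval_pla([a, b], XOR_AND, XOR_OR)[0])
```

In this sketch the bit patterns `XOR_AND` and `XOR_OR` play the role of the control memory: reprogramming the cell amounts to loading different bits, while the evaluation logic stays fixed.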

A simple individually programmable cell may contain one optical input, one optical
output, one electrical I/O per side and one bit of memory for the finite state machine. Such a
cell would require a small PLA with 6 inputs and 6 outputs. If the PLA had a maximum
capacity of 4 product terms, then 72 programmable connections per cell would be required.
Each programmable connection requires only one or two transistors which represents a
relatively small VLSI overhead. However, each programmable connection also requires one
bit of control memory so that even a simple cell would require 72 bits of control memory. A
32-by-32 array of individually programmable cells would thus require 73,728 bits of control
memory. There are three basic approaches to handle the control memory issue. If the control
memory is to be stored on-chip then a large number of transistors will be required to
implement the memory. If the control memory is stored off-chip and the control data is
supplied through electronic pins, then a large number of electrical pins will be required. If the
control memory is stored off-chip and the data is supplied on optical input windows, then a
large number of optical input windows will be required. In all three cases the large size of the
control memory incurs a cost. This control memory complexity must be reduced in order to
yield architectures which can be manufactured with existing smart pixel technologies.

Figure 2. General model of a PLA.

The  simplicity  and  regularity  of  smart  pixel  arrays  can  be  exploited  to  yield  a  field 
programmable  architecture  with  a  moderate  control  memory  complexity.  The  programmable 
cells  can  be  grouped  into  "sectors"  where  the  logic  functions  of  the  cells  in  each  sector  are 
identical.  The  control  memory  used  to  program  the  cells  in  each  sector  can  be  shared, 
resulting  in  a  significant  reduction  in  the  size  of  the  control  memory.  By  exploiting  the 
symmetry  in  a  32x32  array  of  cells  the  control  memory  can  be  reduced  in  size  by  three 
orders  of  magnitude.  A  powerful  programmable  architecture  may  support  a  few  sectors  on  a 
die, where each sector need not occupy a contiguous block of cells. In the HyperPlane smart
pixel arrays [5] a row of pixels may operate on parallel bits of data, where the most
significant pixel in the row makes routing decisions that all other pixels in the row follow. In
this  case  the  2D  array  should  contain  2  sectors,  one  for  most  significant  pixels  and  one  for  all 
other  pixels. 
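
The control memory figures quoted in this section can be reproduced with a short Python sketch. The function names are hypothetical; the count of (2I+O)P programmable connections per cell follows the control-wire count used in the complexity analysis, and the sharing rule (one copy of the control bits per sector) is a simplified reading of the sector scheme.

```python
from typing import Optional


def control_bits_per_cell(n_inputs: int, n_outputs: int, n_products: int) -> int:
    """One control bit per programmable connection: (2I + O) * P."""
    return (2 * n_inputs + n_outputs) * n_products


def array_control_bits(rows: int, cols: int, bits_per_cell: int,
                       n_sectors: Optional[int] = None) -> int:
    """Total control memory for an array of cells.

    With n_sectors=None every cell is individually programmable.
    Otherwise all cells of a sector share one copy of the control bits.
    """
    if n_sectors is None:
        return rows * cols * bits_per_cell
    return n_sectors * bits_per_cell


if __name__ == "__main__":
    per_cell = control_bits_per_cell(6, 6, 4)               # simple cell of the text
    individual = array_control_bits(32, 32, per_cell)       # every cell distinct
    one_sector = array_control_bits(32, 32, per_cell, 1)    # fully symmetric array
    two_sectors = array_control_bits(32, 32, per_cell, 2)   # HyperPlane-style split
    print(per_cell, individual, one_sector, two_sectors)
    print("reduction with one sector:", individual // one_sector)
```

With a single sector the reduction factor equals the number of cells (1024 for a 32x32 array), which is the three orders of magnitude mentioned above; two sectors halve that factor.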

3.  VLSI  Complexity  Analysis. 

Let I, O, P denote the number of inputs, outputs, product terms in a PLA. Let E, L and D
denote the number of electrical I/O, the number of optical I/O and the number of flip-flops
associated with the finite state machine of a programmable cell. The PLA therefore has
I = O = (E + L + D). Each logical input signal of a mask programmable PLA in CMOS typically
requires 14 λ width. Each pair of output signals adds another 14 λ width. Each pair of
product terms adds 10 λ height. The pullup transistors in the And-array add 20 λ width and
the pullups in the Or-array add 20 λ height. A mask-programmable PLA will require ≈
(I+O)*14+20 λ width and (P/2)*10+20 λ height. Each D-flip flop requires ≈ 5 NOR gates
with ≈ 20 x 20 λ² area per gate. Each unidirectional optical I/O port requires ≈ 25 x 25 λ².
The total area of a mask programmable cell and the number of cells per side X in a 1 cm² die
are approximately given by

\[ \text{cell-area} \approx \bigl((I + O) \cdot 14 + 20\bigr) \cdot \bigl((P/2) \cdot 10 + 20\bigr) + D \cdot 2000 + L \cdot 1250, \]

\[ X \approx 10{,}000 / \sqrt{\text{cell-area}}. \]
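
As a rough numerical check, the mask-programmable estimate can be evaluated in Python, assuming the area expression reads ((I+O)*14+20)*((P/2)*10+20) + D*2000 + L*1250 and X = 10,000 / sqrt(cell-area). The function names are hypothetical, and the parameter values (I = O = 6, P = 4, D = 1, L = 1) are those of the simple cell of Section 2, which are not necessarily the values behind the paper's table.

```python
import math

DIE_SIDE = 10_000  # die side in the length units used by the area expression


def pla_width(n_inputs: int, n_outputs: int) -> float:
    """Width of a mask-programmable PLA: (I + O) * 14 + 20."""
    return (n_inputs + n_outputs) * 14 + 20


def pla_height(n_products: int) -> float:
    """Height of a mask-programmable PLA: (P / 2) * 10 + 20."""
    return (n_products / 2) * 10 + 20


def mask_cell_area(n_inputs: int, n_outputs: int, n_products: int,
                   n_flipflops: int, n_optical: int) -> float:
    """PLA area plus 2000 per flip-flop plus 1250 per optical I/O."""
    pla = pla_width(n_inputs, n_outputs) * pla_height(n_products)
    return pla + n_flipflops * 2000 + n_optical * 1250


def cells_per_side(cell_area: float, die_side: float = DIE_SIDE) -> float:
    """Number of cells along one side of a square die."""
    return die_side / math.sqrt(cell_area)


if __name__ == "__main__":
    area = mask_cell_area(6, 6, 4, 1, 1)
    print(f"cell area = {area:.0f}, cells per side = {cells_per_side(area):.1f}")
```

For these illustrative parameters the sketch gives a cell area of 10,770 and about 96 cells per side, the same order as the array sizes quoted in the text.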

Table 1. Results for various programmable smart pixel arrays.


The  simplest  approach  to  add  dynamic  programmability  is  to  insert  programmable 
connections  in  the  PLA  and  share  the  control  memory  using  sectors  of  functionally  identical 
pixels.  In  this  case  the  width  of  the  PLA  must  be  increased  to  allow  the  control  wires  to  enter 
and exit the PLA. In general, each control wire adds ≈ 3 λ width and 3 λ spacing to the
width of the cell. For the analysis assume that all pixels belong to the same sector and that all
control wires run vertically. There will be (2I+O)P control wires running vertically. Assume
M layers of metalization are used to route control wires so that the increase in width is
≈ 6(2I+O)P/M. Assuming the control memory is stored on-chip then (2I+O)P D-flip flops
will be needed to implement the control memory. The total area of a field programmable cell
and the number of cells per side X in a 1 cm² die are approximately given by

\[ \text{cell-area} \approx \bigl((I + O) \cdot 14 + 20 + 6(2I + O)P/M\bigr) \cdot \bigl((P/2) \cdot 10 + 20\bigr) + D \cdot 2000 + L \cdot 1250, \]

\[ X \approx \bigl(10{,}000 - \sqrt{(2I + O) \cdot P \cdot 2000}\bigr) / \sqrt{\text{cell-area}}. \]
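
The field programmable estimate can be sketched the same way, assuming the cell width grows by 6(2I+O)P/M and that the on-chip control memory occupies a square of area (2I+O)*P*2000 whose side is subtracted from the 10,000-unit die side. The function names and the parameter values (I = O = 6, P = 4, D = 1, L = 1, with M = 1 or 2 metal layers) are illustrative assumptions.

```python
import math


def field_cell_area(n_inputs: int, n_outputs: int, n_products: int,
                    n_flipflops: int, n_optical: int, metal_layers: int) -> float:
    """Cell area including the extra width taken by the control wires."""
    control_wires = (2 * n_inputs + n_outputs) * n_products
    width = (n_inputs + n_outputs) * 14 + 20 + 6 * control_wires / metal_layers
    height = (n_products / 2) * 10 + 20
    return width * height + n_flipflops * 2000 + n_optical * 1250


def field_cells_per_side(n_inputs: int, n_outputs: int, n_products: int,
                         n_flipflops: int, n_optical: int, metal_layers: int,
                         die_side: float = 10_000) -> float:
    """Cells per side after reserving room for the shared control memory."""
    # One D flip-flop (area 2000) per control bit, laid out as a square block.
    memory_area = (2 * n_inputs + n_outputs) * n_products * 2000
    area = field_cell_area(n_inputs, n_outputs, n_products,
                           n_flipflops, n_optical, metal_layers)
    return (die_side - math.sqrt(memory_area)) / math.sqrt(area)


if __name__ == "__main__":
    for m in (1, 2):
        x = field_cells_per_side(6, 6, 4, 1, 1, m)
        print(f"M = {m}: about {x:.0f} cells per side")
```

With these assumed parameters the estimate is about 57 cells per side for one routing layer and about 69 for two, which fall on either side of the 64 x 64 figure quoted in the text.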

Table  1  lists  the  results  of  the  analysis.  The  first  two  rows  represent  mask-programmable 
smart  pixel  arrays  and  the  last  two  rows  represent  field  programmable  smart  pixel  arrays. 
Mask-programmable arrays with ≈ 111 x 111 pixels are achievable through a variety of
organizations, with simple cells with one pixel per cell (row 1) or with more complex cells
with 4 pixels per cell (row 2). Field programmable arrays with ≈ 64 x 64 pixels are also
achievable. Finally, we note a few points. Layout efficiencies may increase or decrease the
pixel capacity by ≈ 25% of the computed values. The analysis indicates that simpler cells
with  smaller  PLAs  tend  to  be  more  cost  effective  than  complex  cells  with  larger  PLAs.  For 
sufficiently  large  PLAs  the  area  taken  by  the  control  wires  becomes  a  dominant  term  and  it 
will  be  more  effective  to  place  a  control  bit  of  memory  at  every  programmable  connection, 
yielding  an  architecture  with  individually  programmable  cells. 

4.  Conclusions. 

The  masks  for  a  field  programmable  smart  pixel  array  based  upon  a  CMOS  array  of 
field  programmable  logic  devices  are  currently  being  designed  for  proof-of-concept 
validation. The masks for a simpler device in the GaAs FET-SEED technology are also
being designed. The CMOS die will be fabricated using the Canadian Microelectronics
Corporation (CMC) fabrication runs and should be available by 1995. The development of a
field programmable smart pixel array would accelerate a movement towards standardization.
Standardization  of  pixel  dimensions  should  promote  the  standardization  of  support  systems, 
baseplates,  packaging  assemblies  and  pin-outs. 

[1] K.W. Goossen, J.E. Cunningham, W.Y. Jan, IEEE Phot. Tech. Lett., 5, 776 (1993).

[2] T.K. Woodward, L.M.F. Chirovsky, R.A. Novotny, A.L. Lentine, "Single-Ended
Operation of FET-SEED Smart Pixels", IEEE/LEOS Summer Topical Meeting on Smart
Pixels - 94, pp. 24-25.

[3] M.P.Y. Desmulliez, F.A.P. Tooley, J.G. Crowder, N.L. Grant, B.S. Wherrett, "Optically
Interconnected exchange/bypass self-routing node arrays: logic and design layout",
IEEE/LEOS Summer Topical Meeting on Smart Pixels - 94, pp. 32-33.

[4]  I.  Underwood,  D.G.  Vass,  M.W.G.  Snook,  W.J.  Hossack,  L.B.  Chua,  "Smart  Pixels  using 
Liquid-crystal-over-silicon",  Summer  Topical  Meeting  on  Smart  Pixels  -  94,  pp.  66-67. 

[5]  T.H.  Szymanski  and  H.S.  Hinton,  "A  Smart  Pixel  Design  for  a  Free-Space  Optical 
Backplane",  Summer  Topical  Meeting  on  Smart  Pixels  -  94,  pp.  85-86. 

[6]  C.  Mead  and  L.  Conway,  "Introduction  to  VLSI  Systems",  Addison-Wesley. 




Inst. Phys. Conf. Ser. No 139: Part V

Paper presented at Opt. Comput. Int. Conf., Edinburgh, 22-25 August 1994
© 1995 IOP Publishing Ltd




The Scottish Collaborative Initiative on Optoelectronic Sciences (SCIOS) -
Devices and Demonstrators for Free-Space Digital Optical Processing


A.C. Walker, D.J. Goodwill, B.S. Ryvkin, *M. McElhinney, *F. Pottier,
*B. Vogel, *M.C. Holland and *C.R. Stanley

Department of Physics, Heriot-Watt University, Edinburgh EH14 4AS, U.K.

*Department of Electrical and Electronic Engineering, University of Glasgow, U.K.


Abstract.  A  novel  SEED-type  device  based  on  MQW  GaAs/GaAlAs  has  been 
demonstrated.  In  addition,  16  x  16  arrays  of  strained-InGaAs/GaAs  S-SEEDs 
have  been  successfully  operated  with  a  diode-pumped  1064  nm  laser.  These 
latter  devices  can  be  combined  with  CMOS  electronics  to  create  powerful 
smart  pixel  arrays.  The  SCIOS  consortium  is  developing  digital  parallel  optical 
processing  demonstrators  based  on  this  technology. 


1.  Introduction 

The  Scottish  Collaborative  Initiative  on  Optoelectronic  Sciences  (SCIOS)  is  a  partnership 
between  Heriot-Watt,  Glasgow,  St.  Andrews  and  Edinburgh  Universities  that  has  been 
running since 1990. It has as its objectives: (i) to pursue research into the physics, materials
science,  fabrication  and  packaging  technologies  underlying  the  development  of  optoelectronic 
devices,  and  (ii)  to  combine  the  results  of  (i)  with  new  system  architecture  concepts  so  as  to 
realise  free-space  parallel  optical  computing  and  information  processing  demonstrators 
experimentally.  The  SCIOS  technology  programme  includes  work  on  III-V  semiconductor 
devices,  diffractive  optics,  spatial  light  modulators  and  solid-state  lasers.  Recent  emphasis  has 
been  on  smart  pixel  arrays  including  hybrid  silicon/InGaAs  devices.  In  addition,  new  digital 
free-space  system  architectures  are  being  explored  and  a  range  of  novel  demonstrators 
constructed. 

This  paper  describes  recent  progress  in  the  development  of  digital  optical  switch  arrays 
based  on  III-V  semiconductor  multiple  quantum  well  (MQW)  modulator  structures  and  the 
way  they  can  be  exploited  to  construct  parallel  optical  processing  demonstrator  systems. 

2. Vertically-Integrated Detector-Modulators (VIDeMs)

SEED  arrays  [1]  operating  at  850  nm  have  proven  to  be  effective  as  digital  photonic  logic 
planes  in  experiments  on  parallel  optical  processing  and  switching  systems  [2,3].  However, 
system  speed  can  be  limited  by  the  degradation  of  SEED  performance  at  the  required  high 
optical  power  [4,5]  due  to  local  heating,  saturation  and  field  screening.  Woodward  et  al  [6,7] 
showed  that  both  the  Joule  heating  associated  with  the  photoresponse  and  the  degradation  in 
performance  at  high  irradiance  could  be  avoided  by  using  material  with  a  radically  decreased 
non-radiative carrier lifetime τ in the electroabsorptive region. However, the ion-implantation
method, used in refs. 6 and 7 to reduce the carrier lifetimes, broadens the exciton absorption
peak and reduces the modulation performance. We have made MQW modulator structures
[8] in which the carrier lifetime is already short in the as-grown structure, so that the finished
device had both low photocurrent and good modulation performance. We have gone on to
demonstrate  how  such  MQW  material  may  be  used  in  a  novel  electroabsorptive  optical 
switching  device. 

This device can be described as a vertically integrated detector-modulator (VIDeM), in
which the two elements are connected electrically in parallel. The non-QW GaAlAs detector
(lower part) has unity quantum efficiency under reverse bias. The modulator diode (top part)
consists of deep GaAs/AlAs quantum wells and has extremely low quantum efficiency
(η = 0.4% at 15 V, η = 1.4% at 25 V) due to the long sweep-out time across the high barriers,
coupled with a short non-radiative lifetime τ. In this case, the low value of τ was, we believe,
produced by low-level impurities in the MBE system at the time of growth, which fortunately
did not broaden the exciton absorption feature. We hope to reproduce this effect controllably
by growing at much lower temperatures.

851  nm  radiation  is  incident  on  the  modulator  and,  depending  on  its  transmission  state, 
a  portion  passes  through  to  the  detector.  The  quantum  confined  Stark  effect  in  the  MQW 
modulator  and  the  Franz-Keldysh  effect  in  the  detector  combine  to  give  N-type  negative 
resistance, leading to a factor of 3 decrease in the responsivity of the whole structure between
0 V and 15 V for 10 µW incident optical power. This behaviour is maintained at intensities 8
times higher than for a conventional MQW SEED [4,5]. To show how this structure can be
used to make a high performance reflection-mode SEED-type switch, we have recently made
devices with optical output provided by a partial mirror (70% reflectivity) between the
modulator and detector. A contrast ratio of 4 and a high state reflectivity of 40% are
expected. Control of photo-response is likely to be an important factor in the design of future
smart-pixel  devices  based  on  semiconductor  modulators. 

3.  Strained-InGaAs  SEEDs  for  1064  nm 

The clock-rates for demonstration parallel digital optical processing systems based on current-
generation SEED arrays, operating at 850 nm wavelength, are limited by the diode laser
power reaching the device array (milliwatts). The consequent demand for a better match to
higher-power (> 1 watt) lasers, such as diode-pumped Nd:YLF and Nd:YAG, has prompted
the  development  of  strained  InGaAs/GaAs  MQW  structures  grown  by  molecular  beam  epitaxy 
(MBE)  in  the  form  of  SEEDs  operating  at  1047-1064  nm.  Previously  reported  devices  in  this 
materials  system  [9,10,11]  had  insufficient  contrast  ratio  and  too  high  operating  voltage  for 
use  in  switching  systems.  We  have  obtained  enhanced  performance,  compared  to  our 
previously reported device [11], by improved growth techniques, and have demonstrated for
the  first  time  operation  of  arrays  of  SEEDs  at  1064  nm. 

In  our  S-SEED,  the  pin  MQW  modulator  is  grown  strain-balanced  to  a  deliberately 
relaxed  buffer  layer  of  intermediate  lattice  constant,  such  that  the  InGaAs  wells  are  in 
compression  and  balance  the  GaAs  barriers  in  tension.  Growth  was  by  MBE  on  a  [100] 
orientated  semi-insulating  GaAs  substrate.  To  minimise  the  gliding  of  dislocations,  and  hence 
avoid  broadening  of  the  excitonic  absorption  peak,  a  low  temperature  was  used  with  a  low 
arsenic:group-III flux ratio to keep the mobility of the group-III atoms high. The buffer layer
consisted of an i-InGaAs layer with 10-step grading of In composition from 0 to 13.5%,
followed by an i-InAlAs/GaAs superlattice. The structure was designed to be completely
relaxed  at  the  top  of  the  graded  InGaAs  layer,  but  X-ray  diffraction  measurements  showed  it 
was  only  71%  relaxed.  The  room  temperature  absorption  peak  occurred  at  1061  nm  rather 
than  the  design  wavelength  of  1047  nm.  However,  the  variation  in  this  wavelength  was  only 
±0.2  nm  across  the  central  25  mm  of  the  50  mm  diameter  substrate.  The  half-width  at  half 
maximum  on  the  low  energy  side  of  this  peak  was  6.0  meV  (5.4  nm),  which  is  the  best 
reported  for  this  system.  It  is  comparable  to  the  figures  of  6.8  meV  for  MOCVD-grown 
InGaAs/GaAsP  [12]  and  5.25  meV  for  gas-source  MBE-grown  in  InGaAs/InGaP  [13],  for 
both  of  which  the  MQWs  are  strain-balanced  to  the  substrate.  It  also  compares  well  to 
4.5 meV for similar GaAs/GaAlAs MQWs [1].
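
The equivalence between the quoted linewidth in energy and in wavelength can be checked with the small-width relation Δλ ≈ λ² ΔE / (hc). The Python sketch below is only an illustration; the function name is hypothetical and the constant hc ≈ 1239.84 eV nm is a standard value, not one taken from the paper.

```python
HC_EV_NM = 1239.841984  # Planck constant times speed of light, in eV * nm


def linewidth_nm(width_mev: float, wavelength_nm: float) -> float:
    """Convert a small spectral width from meV to nm at a given wavelength."""
    width_ev = width_mev * 1e-3
    return wavelength_nm ** 2 * width_ev / HC_EV_NM


if __name__ == "__main__":
    # Half-width of 6.0 meV at the measured 1061 nm absorption peak.
    print(f"{linewidth_nm(6.0, 1061.0):.2f} nm")
```

The result is close to the 5.4 nm given in the text for the 6.0 meV half-width.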

For initial optical testing with a tunable laser, 200 µm diameter structures were
fabricated by wet etching and metallisation. The front of the sample had a partial anti-
reflection coating, the back was polished but uncoated. The photocurrent characteristics
showed  98%  quantum  efficiency  at  0  V,  100%  at  1  V  reverse  bias.  The  transmission-voltage 
performance  shows  a  contrast  ratio  of  2.0  between  0  and  10  V  at  1060  nm,  which  increases  to 
4.2  by  the  incorporation  of  a  mirror. 

S-SEED arrays, ranging in size from 16 x 16 (20 µm x 20 µm windows) to 48 x 96
(9216 diodes with 7 µm windows), have been fabricated. The devices incorporate a metal
mirror  on  the  top  for  reflection-mode  operation,  with  the  light  passing  through  both  the  GaAs 
substrate  and  the  temperature  stabilised  sapphire  mount.  Anti-reflection  coatings  have  been 
applied  at  all  interfaces.  The  whole  package  is  compatible  with  our  optomechanical  system 
used  for  optical  processing  systems  experiments  [2].  Bistable  switching  with  3.0  -  4.2  contrast 
ratio  at  10  V  bias  has  been  demonstrated  for  wavelengths  1054  -  1064  nm. 

4.  InGaAs/CMOS  Smart  Pixels 

InGaAs  SEED  arrays  are  well  suited  to  flip-chip  assembly  with  a  silicon  electronic  chip 
because  of  the  substrate  (GaAs)  transparency  at  the  operating  wavelength.  Thus  smart-pixel 
arrays  can  be  created  with  each  pixel  comprising:  a  differential  detector  InGaAs  input  stage, 
CMOS  logic,  and  a  differential  InGaAs  modulator  output  stage.  A  4  x  2  smart-pixel  array, 
capable  of  making  numerical  comparisons  of  two  optical  (bit-serial)  inputs  and  directing  them 
to  specified  outputs,  has  been  designed  as  the  first  step  towards  the  larger  (32  x  32)  array 
required  for  a  sorting  module  demonstrator  (see  next  section).  The  InGaAs  opto-chip,  which 
includes  two  diode-clamped  differential  inputs  and  two  differential  (modulator)  outputs  per 
node  has  been  successfully  fabricated. 

The same approach can be taken to construct a high-speed spatial light modulator
capable of providing parallel optical inputs to a processor at ~ 100 MHz data rate per channel.
In this case the device is electrically addressed, via a serial to parallel interface, and only
output  modulators  are  required  on  the  opto-chip.  We  have  used  the  strained-InGaAs 
technology  to  fabricate  a  16  x  16  array  of  differential  modules.  This  has  been  flip-chip 
assembled  (solder-bump  method)  onto  a  CMOS  chip  that  was  originally  designed  as  the  driver 
for  a  liquid-crystal  SLM  [14]. 

5.  An  Optical  Sorting  Module 

The  SCIOS  consortium  is  currently  constructing  a  digital  optical  sorting  module  based  on 
InGaAs/CMOS smart pixel arrays, powered by diode-pumped Nd:YLF lasers and
interconnected  by  a  free-space  optical  perfect  shuffle.  The  array  size  is  32  x  32  and  thus  it  will 
be  capable  of  doing  a  numerical  sort  on  1024  input  words.  The  target  specification  is  for  a  full 
sort, using 8-bit words, to be completed every 10 µs. This requires: an input smart pixel
array operating as an electrically-addressed SLM and providing 8 binary 32 x 32 images in
80 ns (100 MHz) once every 10 µs; a matched smart-pixel shift register to act as the
intermediate  store;  and  a  smart-pixel  sorting  node  array  capable  of  carrying  out  the  compare 
and  exchange  operations  demanded  by  the  algorithm  on  the  10  ns  per  bit-plane  timescale 
(100  MHz).  Calculations  of  optical  input  requirements  indicate  that  1-1.5  watts  of  laser 
power  are  required  per  array. 
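
The timing budget implied by this specification can be worked through in a few lines of Python. The function names are hypothetical. The estimate of (log2 N)² passes for a shuffle-based sort is an outside assumption introduced only for comparison; the text does not state how many passes the algorithm needs.

```python
import math


def bit_plane_time_ns(clock_mhz: int) -> int:
    """Duration of one bit-plane at the given clock rate, in ns."""
    return 1000 // clock_mhz


def word_time_ns(word_bits: int, clock_mhz: int) -> int:
    """Time to stream one bit-serial word through the array, in ns."""
    return word_bits * bit_plane_time_ns(clock_mhz)


def passes_in_period(period_ns: int, word_bits: int, clock_mhz: int) -> int:
    """How many full-word passes fit into one sort period."""
    return period_ns // word_time_ns(word_bits, clock_mhz)


def assumed_shuffle_passes(n_words: int) -> int:
    """Assumption (not from the text): about (log2 N)^2 passes are needed."""
    return int(math.log2(n_words)) ** 2


if __name__ == "__main__":
    n_words = 32 * 32
    print("word time (ns):", word_time_ns(8, 100))
    print("passes available in 10 us:", passes_in_period(10_000, 8, 100))
    print("passes assumed necessary:", assumed_shuffle_passes(n_words))
```

Under that assumption, roughly 100 passes of 80 ns each would occupy about 8 µs, which would fit within the 10 µs target; the actual pass count depends on the algorithm used.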

Production  of  the  optoelectronic,  optical  and  laser  components  required  for  this  system 
is  in  an  advanced  state.  Comparisons  with  electronic  parallel  sorting  hardware,  which  is 
restricted  to  local  interconnections  between  nodes,  indicate  that  this  optical  approach  should 
be  significantly  faster. 

This work was supported by the UK Science and Engineering Research Council under the Scottish
Collaborative  Initiative  on  Optoelectronic  Sciences  (SCIOS).  BSR  acknowledges  the  support 


Smart Pixels with VCSELs: Potential and Demonstration
System

Abstract. A selfrouting switching network using VCSELs, microoptical array components and
an opto-ASIC has been demonstrated. The setup of one entire stage with 40 channels is only
50 mm long. The frame rate is 2 MHz, limited by the matrix-addressing of the VCSELs. A 7
stage 8 channel selfrouting network has been mapped onto the setup and bit error measurements
at closed loop operation including re-addressing showed a BER better than 10⁻¹⁰. Model
calculations show that on the order of 10¹² instructions per second can be achieved with such systems.

1.  Introduction 

In the last few years, progress has been made in building vertical cavity surface emitting
laserdiode (VCSEL) arrays [1], microoptical elements like microlens arrays [2] and small
hologram facets [3], and the integration of photodiodes, analog and digital electronics on silicon
chips [4]. Using these devices allows for setting up small and powerful smart pixel systems,
besides the approach using fast modulators like SEED arrays [5].

In  this  paper,  we  evaluate  the  potential  of  the  hardware  with  a  simplified  model.  A  setup 
for a selfrouting switching network has been successfully demonstrated. The realized
40-channel stage is only 50 mm long.

2.  The  smart  pixel  model 

The power of smart pixel systems does not primarily arise from fast data processing but from
parallel processing (in the data planes) and pipelining (in the data flow direction). Therefore, a
system concept has to enable "large" systems with many pixels in each data plane and with many
data planes. It is mandatory to choose a modular, hierarchical system, where (a) each module can
be preadjusted and (b) many modules can be arranged together with simple (final) adjustment.
This implies a specified interface between the modules (fig. 1), which might be collimated beams
with a pitch of 250 µm. For example, for a shuffle/exchange switching network, only two types
of modules are necessary:

(a) the switching module. It provides electronic switching and selfrouting logic and the
optoelectronic interfaces. Since the packaging in the suggested model is based on a hybrid
back-to-back mounted concept (GaAs for the VCSELs, Si for the receiver and logic), some
mechanical "elasticity" can be implemented e.g. where electrical bonding wires are located. Flip
chip mounting is therefore not recommended (besides, thermal problems are more difficult to
solve). Each module might have 8*8 channels and they can be stacked in each data plane;

(b) the permutation module, which is a global optical interconnection. It is identical for all stages.

Fig. 1: Modular system.

If the system is to be extended, e.g. towards more channels of the shuffle/exchange
network, both more pixels in each plane and more planes are required. This can be achieved by
adding more of the (same) switching modules; however, new permutation modules have to be
introduced.

Other  applications,  like  3D  computer  architectures,  bit  algorithms  etc.  are  discussed  in 
another  paper  [6]. 

3.  Limitations 

Three  basic  effects  limit  the  performance  of  smart  pixel  systems  [7,8]: 

(a)  the  optical  system,  i.e.  diffraction  and  crosstalk  (due  to  stray  light,  reflections,  unwanted 
diffraction  orders  etc.)  limits  the  channel  density  (since  only  a  certain  amount  of  crosstalk  is 
tolerable); 

(b)  the  thermal  power  which  can  be  removed  per  unit  area  from  the  system  limits  the  data 
throughput  per  unit  area  and  the  connectivity; 

(c)  the  available  area  per  channel  limits  the  functionality  of  each  pixel. 

Besides  these  three  effects,  which  are  linked,  tolerances  e.g.  of  the  mechanical  alignment,  have  to 
be  considered. 

We  made  calculations  for  the  following  (ideal)  system: 

Space variant optical interconnections, any possible connection between two adjacent pixelplanes
should be possible, the max. beam deflection angle is 45° and the max. F number of the
microlenses is F# = 1.

Thus, the distance between two adjacent planes is √2 a where a is the lateral dimension of a
pixelplane. Due to the diffraction limit, the lateral dimension D of one optical channel (i.e. microlens,
hologram  facet  etc.)  is  [7] 

The pitch (and the available area) of a single pixel with one optical channel attached (which we
call an elementary pixel) is thus given. For a pixelplane with a = 50 mm, the lateral dimension of an
elementary pixel is D ≈ 350 µm (with λ = 850 nm), so that approx. n = 20,000 elementary pixels per
plane are possible.
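
The geometry behind these numbers can be checked with a short Python sketch. It assumes that the plane spacing follows from steering a beam across the diagonal of a square pixelplane at the maximum deflection angle, and that elementary pixels tile the plane on a square grid of pitch D. The function names are hypothetical, and the diffraction formula for D itself is not reproduced here; the quoted D = 350 µm is used directly.

```python
import math


def plane_spacing_mm(plane_side_mm: float, max_deflection_deg: float = 45.0) -> float:
    """Spacing needed so a beam can cross the plane diagonal at the max angle."""
    diagonal = math.sqrt(2) * plane_side_mm
    return diagonal / math.tan(math.radians(max_deflection_deg))


def pixels_per_plane(plane_side_mm: float, pixel_pitch_um: float) -> float:
    """Number of elementary pixels on a square grid of the given pitch."""
    per_side = plane_side_mm * 1000 / pixel_pitch_um
    return per_side ** 2


if __name__ == "__main__":
    print(f"plane spacing: {plane_spacing_mm(50.0):.1f} mm")
    print(f"pixels per plane: {pixels_per_plane(50.0, 350.0):.0f}")
```

For a = 50 mm this gives a spacing of about 70.7 mm and roughly 20,400 elementary pixels, in line with the approximate figure of 20,000 quoted above.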

To estimate the functionality of each pixel, an advanced technology, such as that used in
today's microprocessors like Intel's Pentium, was assumed. It provides approx. 10⁴ transistors/mm².
An 8 bit macropixel, consisting of 8 elementary pixels, can have on the order of 10⁴ transistors. (A
two  input  gate  needs  4  transistors,  one  bit  static  RAM  memory  can  be  realized  with  s2 
transistors.) The complexity of one macropixel corresponds to 0.1 IPS (instructions per second)
per Hz of clock rate, i.e. 0.1 instructions per clock cycle; e.g. a 32 bit addition can be carried out
in 10 clock cycles. Here, the same assumptions as in chapter 2 have been made, together with the
assumption that the area for the photodiode, preamplifier etc. is negligible.
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Taking  a  system  with  a  clock  rate  c  of  100  MHz  and  m  -  100  data  planes,  the  overall 
performance  is: 

8  Hz 

The heat which would have to be removed is approx. 8 W/cm² (liquid cooling) (derived from
Pentium  data). 

All the above calculations are very rough estimates, since constraints for mechanical
mounting, power supply rails, cooling, misalignment etc. are not considered. Besides, it is problematic
to compare a SIMD-like architecture with commonly used MIMD architectures. So, the results
here can only be considered as an order-of-magnitude estimate.

4.  Experiment 

To demonstrate that small systems can
already be set up today, one stage of a
selfrouting switching network has been
built in our lab (fig. 2, 3). It consists of a
10 × 10 VCSEL array (provided by R. A.
Morgan), a microlens array to collimate the
beams,  a  Kepler  telescope  (because  the 
pitches  of  the  VCSELs  and  the  receivers 
are  different),  the  holographic  permutation 
element  using  close  cascade  technique, 
another  microlens  array  and  an  opto-ASIC 
with  the  photodiodes  and  the  logic. 

Microlenses, holograms and ASIC were designed and manufactured at our
institute. Due to problems with the drivers for the matrix-addressed VCSELs, the channel data
rate was limited to 2 Mbit/(s*channel).

Fig. 2. Photo of the experimental setup.

A 7-stage, 8-channel selfrouting sorting network, which implemented Batcher's algorithm, has
been  mapped  onto  the  setup  (fig.  4).  Since  only  one  VCSEL  array  was  available,  the  input  data 
was  fed  onto  a  discrete  laser  diode  array.  A  2-D  fiber  bundle  with  different  pitches  on  each  side 
[9]  (shape  converter)  conducts  the  light  onto  the  first  opto-ASIC  (first  stage  with  electronic 
exchange/bypass  switches).  The  shuffle  of  the  second  stage  had  to  be  wired  electrically,  because 
only  five  stages  (stage  #3  to  stage  #7)  are  available  in  the  miniaturized  setup  (fig.  2). 
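
To make the structure of such a sorting network concrete, here is a minimal Python sketch of Batcher's bitonic sorter for 8 channels. The text does not say which of Batcher's networks was used or how its stages are counted, so the choice of the bitonic variant is an assumption; this construction has 6 compare-exchange stages for 8 inputs, and the 7-stage hardware mapping described above presumably counts or arranges its stages differently. The function names are illustrative.

```python
def bitonic_stages(n):
    """Build the compare-exchange stages of a bitonic sorter (n a power of 2).

    Each stage is a list of (i, j, ascending) tuples: channels i < j are
    compared, and swapped if they violate the requested direction.
    """
    stages = []
    k = 2
    while k <= n:                 # size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # distance between compared channels
            stage = []
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    ascending = (i & k) == 0
                    stage.append((i, partner, ascending))
            stages.append(stage)
            j //= 2
        k *= 2
    return stages


def run_network(values, stages):
    """Apply the exchange/bypass decisions stage by stage."""
    v = list(values)
    for stage in stages:
        for i, j, ascending in stage:
            out_of_order = v[i] > v[j] if ascending else v[i] < v[j]
            if out_of_order:      # "exchange"; otherwise "bypass"
                v[i], v[j] = v[j], v[i]
    return v


stages = bitonic_stages(8)
result = run_network([5, 2, 7, 0, 3, 6, 1, 4], stages)
print(len(stages), "stages,", result)
```

Every comparator acts only on its two inputs, which is what allows each exchange/bypass switch to route its data without global control.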

The average light power fed into the setup for each differentially coded channel was 11.2 µW; the
optical losses (mostly due to Fresnel reflections) were approx. 13 dB. With two interfaces to a PC
(fifo-memory-like),  BER  measurements  have  been  carried  out  at  the  7  stage  system,  including 
readdressing.  The  BER  was  better  than  10'^'^. 
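
The power budget implied by these figures can be checked with a few lines of Python. The sketch below assumes that the quoted 13 dB loss applies to the full optical path of one channel, from the 11.2 µW input to the receiver; the text does not state this explicitly, and the function name is illustrative.

```python
def apply_loss_db(power_w, loss_db):
    """Return the optical power remaining after a loss given in dB."""
    return power_w * 10.0 ** (-loss_db / 10.0)


p_in = 11.2e-6    # average input power per channel in watts (from the text)
loss_db = 13.0    # approximate optical loss in dB (from the text)

# ASSUMPTION: the whole 13 dB is lost between source and photodiode.
p_out = apply_loss_db(p_in, loss_db)
print(f"power at the receiver ~ {p_out * 1e6:.2f} uW")
```

Under this assumption, roughly 5 % of the input power, a little over half a microwatt, reaches each photodiode.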

The size of the setup (fig. 2, 3) is dominated by the telescope. If the pitches of the emitters and
receivers were equal, such a stage would be only 10 mm long. With a parallel addressable
VCSEL  array  and  a  faster  receiver  (under  development),  a  data  rate  of  200  Mbit/(s*channel) 
seems  feasible. 


Fig.  3:  Scheme  of  the  optical  setup. 


Fig. 4: Data flow in the experimental setup.

Abstract. Smart-pixel technology provides an efficient method of implementing
optoelectronic error-backpropagation neural networks. Quasi-linear
detector/modulator circuits are required to provide the interface between optics
and electronics. A design to perform the learning algorithm used by the
S-SEED network is simulated using 1 µm CMOS electronic and FET-SEED
circuits. Simulations of the S-SEED neural network demonstrate the existence
of 'windows' of optimal learning behaviour with respect to the neuron
contrast ratio.


1.  Thresholding  Neuron  Devices  and  Error  Propagation  Algorithm. 

The  development  of  electronic  artificial  neural  networks  has  opened  up  considerable 
interest  in  the  suitability  of  optics  to  provide  the  high  interconnection  bandwidth  required 
to fully exploit the advantages inherent in the neural paradigm [1] - [3]. A network has
been  devised  which  uses  optoelectronic  devices  in  every  section  of  the  network. 

The  devices  employed  as  the  thresholding  neurons  are  S-SEEDs  operating  as  simple 
buffers  [6].  The  S-SEED  array  is  subdivided  into  three  neural  layers,  the  input,  hidden 
and  output  layers  described  in  the  classic  multilayer  perceptron  [4].  Although  these  layers 
are  architecturally  distinct  they  are  contained  on  the  same  physical  plane.  The  algorithm 
used  to  train  the  network  is  a  modification  of  the  error  back  propagation  algorithm  [4] 
[5].  Given  N  neurons  the  amount  that  the  weight  matrix  is  altered  by  is  (z  denotes  an 
output  unit) 

\Delta W_{ij} = \eta \sum_{\text{outputs}} \ldots, \quad i,j = 1..N, \ i \neq z

\Delta W_{ij} = \ldots, \quad i,j = 1..N, \ i = z    (1)

ℛ is the SEED reflectivity, η is an externally set learning constant, T_i is the target for
output unit i and O_i is the output from unit i.

The  output  from  the  smart-pixel  array  which  calculates  the  product  in  equation  1 
is processed optically to produce the sum over the output units required to calculate the
change for non-output units.

2. Smart Pixel Electronic Designs

2.1. Linear Optical Receiver and Transmitter.

A requirement for the optoelectronic implementation of the neural algorithm is an analogue
receiver/modulator interface between the optics and the electronics. The receiver
subcircuit is provided by a clamped FET-SEED, with the SEED held at the higher potential
above the FET, operating at the point [6]. The clamp diodes are set to restrict
the operating voltage of the SEED to the linear region of the responsivity and reflectivity
of the SEED.


Figure 1. F-SEED Detector (Solid) and Modulator (Dashed) Responses versus input voltage at λ
= 860 nm: Clamp diodes restrict operation of the detector within quasi-linear
region  of  responsivity  curve.  Modulator  read  beam  power  =  1  mW. 


The  modulator  subcircuit  is  provided  by  a  FET-SEED  with  the  SEED  at  a  lower 
potential  than  the  FET,  again  operating  at  the  A^  point  of  the  SEED.  The  voltage  from 
the  smart-pixel  is  applied  to  the  gate  node  of  the  FET  and  the  SEED  is  read. 

2.2.  Arithmetic  Unit. 

The  accurate  calculation  of  the  change  in  weights  for  a  particular  input  pattern  is  of 
fundamental  importance  to  the  efficient  operation  of  the  network.  The  optoelectronic 
interfaces described previously in conjunction with CMOS electronics provide an arithmetic
unit for the calculation of the product,

\Delta_{ij} = \left( \mathcal{R}(T_i) - \mathcal{R}(O_i) \right) \mathcal{R}(O_j)    (2)

The neural plane output vector is fanned-out by rows and then columns and the column
fan-out applied as read beams to the output of the arithmetic unit to simplify the
electronic design of the unit. The row fan-out is applied as the input to the smart-pixel
array along with the row fan-out of the target vector. The arithmetic unit consists of
a differential amplifier, a common-source amplifier and two level shifters. The circuit
was simulated using the Spice3 package for the Eurochip2 CMOS process and the results
show the expected output from the modulator stage of the smart-pixel (Figure 2).
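
The data flow through the smart-pixel array can be illustrated numerically. The Python sketch below assumes that each pixel forms the difference between the target and output signals delivered by the row fan-out (the role of the differential amplifier) and that the modulator multiplies this difference by the read beam delivered by the column fan-out. This reading of the product, and all names and example values, are assumptions for illustration only; the analogue circuit itself is not modelled.

```python
def pixel_products(target_refl, output_refl):
    """Return the matrix of per-pixel products.

    ASSUMED form: entry [i][j] = (R(T_i) - R(O_i)) * R(O_j), where the
    inputs are already expressed as reflectivity-like signal levels.
    Row index i comes from the row fan-out, column index j from the
    column fan-out used as read beams.
    """
    return [
        [(t_i - o_i) * o_j for o_j in output_refl]
        for t_i, o_i in zip(target_refl, output_refl)
    ]


# Hypothetical signal levels for a two-unit example.
targets = [0.8, 0.2]
outputs = [0.5, 0.4]
products = pixel_products(targets, outputs)
for row in products:
    print([round(x, 3) for x in row])
```

A row whose output already matches its target contributes zero to every product in that row, so no change is requested for that unit.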


Figure 2. Smart-pixel arithmetic unit output response (axis label: P_BASE + ℛ(O_j) P_READ).
P_BASE = 0.9 mW; P_READ = 0.1 mW. Bias power is added to the modulator read beams to ensure
linear operation over the range of output voltages produced by the electronic
circuit.


3.  Network  Simulations. 

A  simulation  of  the  network  was  developed  and  a  variety  of  simple  tasks  were  presented  to 
the network. To date these consist of simple Boolean functions (XOR, Half Adder), more
complex Boolean functions (Full Adder, 2x2 Switching Node) and 1D pattern recognition
and  discrimination. 

The  network  was  configured  to  learn  the  XOR  function  with  two  input,  one  output 
and seventeen hidden units. The four input patterns of the XOR function were presented
in a random order to a network with an absolute zero minimum weight (Figure 3: curve 1).
The effect of a non-zero minimum weight was considered with a minimum weight of 0.05
(Figure 3: curve 2).

Each network configuration shows the same general behaviour: at low contrast the
learning time increases asymptotically, until at C = 1 the network requires an infinite number of
presentations to learn. For 2.5 < C < 5.5 the learning time is minimised at around 10
presentation sets (of four patterns each) for both networks. As the contrast ratio rises the
learning  time  begins  to  rise.  This  is  due  to  the  distance  the  weights  have  to  move  in  the 
multi-dimensional  weight  space.  At  low  contrast  the  distance  moved  on  any  one  wrong 
output  is  small  and  so  the  network  requires  a  large  number  of  presentations  to  approach 
the  correct  neighbourhood  in  the  weight  space.  As  the  contrast  increases  the  distance 
moved  per  cycle  increases  towards  a  point  which  most  effectively  moves  the  weights  to  a 
correct  solution.  The  increase  in  learning  time  observed  at  the  upper  end  of  the  contrast 
ratio  is  produced  by  the  weights  oscillating  around  the  correct  solution  in  the  weight 
space until they come within the region which will produce correct behaviour.

Figure 3. Optoelectronic Neural Network Simulation: Learning times with
respect to output contrast ratio for different network configurations (horizontal axis: neuron contrast ratio).


4.  Conclusions. 

We have presented the design for a smart-pixel which performs the weight update calculation
for a modified error back propagation network. The pixel employs novel clamped
FET-SEED  detectors  and  modulators  to  produce  analogue  input/output  stages  for  the 
analogue  electronic  circuit.  These  modulators  are  designed  to  operate  considerably  faster 
than  the  neural  plane  to  minimise  any  bottleneck  the  learning  process  could  produce. 

The  variation  of  the  learning  time  of  the  network  with  neuron  contrast  ratio  shows 
the  existence  of  ’windows’  of  reduced  learning  time  where  the  network  weights  move 
into  the  correct  region  in  the  weight  space  efficiently.  The  use  of  devices  with  non-zero 
minimum  reflectivity  in  the  weighting  unit  produces  a  marginal  increase  in  the  learning 
time for high contrast ratio weighting units.

Abstract

Time-resolved  photoluminescence  measurements  have  been  used  to  study  temperature 
dependent  carrier  sweepout  in  biased  multiple  quantum  well  p-i-n  diodes. 


The  multiple  quantum  well  (MQW)  p-i-n  structure  is  commonly  used  in  the  design  of 
optoelectronic devices, such as detectors, modulators and SEEDs. For these applications,
both  high  speed  operation,  and  low  optical  input  power  are  often  required  and  as  such  it  is 
desirable  to  know  which  factors  limit  device  performance  in  these  respects.  For  example,  the 
use  of  a  MQW  structure  places  a  fundamental  limitation  on  the  response  of  the  device,  since 
carriers  must  be  swept  out  of  the  quantum  wells  by  the  applied  field  before  they  can 
contribute  to  the  photocurrent.  The  size  of  the  device  also  has  important  implications  in  terms 
of  both  speed  and  input  power.  Large  area  devices  suffer  from  an  increased  response  time  and 
input  energy,  due  to  their  larger  capacitance,  whilst  smaller  devices  suffer  from  increased 
surface  recombination  at  the  mesa  sidewalls.  In  this  paper  we  present  a  study  of  the  excess 
carrier  dynamics  in  biased  MQW  p-i-n  diodes  using  the  technique  of  time-resolved 
photoluminescence  (TRPL). 

The  measurements  were  performed  on  a  microscope  based  instrument  (a  derivative  of 
an Edinburgh Instruments Lifemap) which has been described in detail elsewhere [1]. Sample
excitation is provided at a wavelength of 765 nm by a passively Q-switched picosecond
AlGaAs laser diode (pulse duration <20 ps) [2]. The actively quenched single photon
avalanche diode (SPAD) detector [3], when coupled with the microscope optical system,
gives a spatial resolution of <5 µm [4] and allows TRPL measurements in the spectral range
780-1100 nm. The instrument uses the time-correlated single photon counting (TCSPC)
technique [5] and has an instrumental full width at half-maximum (FWHM) of 50-60 ps. For
low temperature measurements, the samples were mounted in a continuous flow helium
cryostat (Oxford Instruments model CF1104) which had been modified to allow close optical
access  to  the  sample,  thus  maintaining  the  high  spatial  resolution  of  the  microscope  system. 

The  mechanisms  which  govern  the  photoluminescence  (PL)  decay  time  from  a  biased 
MQW structure can be divided into three categories: a) recombination, b) 2-dimensional
lateral  diffusion  (along  the  plane  of  the  wells),  and  c)  vertical  transport  or  sweepout  through 
the  barrier  material.  Recombination  and  diffusion  will  only  be  of  significance  at  or  near  the 
zero  internal  field  condition,  and  thus  carrier  sweepout  will  be  the  dominant  PL  decay 
mechanism. 

For deep GaAs/Al_xGa_{1-x}As quantum wells (i.e. x > 0.2), there are two mechanisms by
which carriers escape in the presence of an electric field: thermionic emission and quantum
mechanical tunnelling. Schneider et al. [6] give the thermionic emission time from a
quantum well of width L_w for an applied field ε as


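\tau_{e,h} = \left( \frac{2 \pi m_{e,h} L_w^2}{k_B T} \right)^{1/2} \exp\left( \frac{H_{e,h}(\varepsilon)}{k_B T} \right)    (1)

where e or h refers to electrons or holes respectively, m is their effective mass in the well and
T is the absolute temperature. The emission rate is determined by the barrier height, H(ε),
over which the particles must be emitted from a bound state in the quantum well, into the
continuum of states. H(ε) is given as


H_{e,h}(\varepsilon) = \Delta E_{c,v} - E_n^{e,h} - \frac{|e| \varepsilon L_w}{2}    (2)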


where ΔE_{c,v} is the conduction or valence band offset depending on whether the particles are
electrons or holes and E_n^{e,h} is the energy of the n-th electron (hole) subband relative to the
bottom of the well.

The quantum mechanical tunnelling rate from the n-th sublevel has been given by
Larsson et al. [7] as the product of the barrier collision frequency and the quantum
mechanical tunnelling probability:


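\frac{1}{\tau_{e,h}} = \frac{n \pi \hbar}{2 L_w^2 m_{e,h}} \exp\left( - \frac{2 L_b \sqrt{2 m_b^{e,h} H'_{e,h}(\varepsilon)}}{\hbar} \right)    (3)


L_b is the barrier thickness, m_b^{e,h} is the particle effective mass in the barrier material, and H'(ε)
the effective height of the barriers in the presence of an electric field and is given as

H'_{e,h}(\varepsilon) = \Delta E_{c,v} - E_n^{e,h} - \frac{1}{2} |e| \varepsilon (L_w + L_b)    (4)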


where  the  height  of  the  tilted  barriers  has  been  approximated  by  the  average  value. 

TRPL measurements were performed on two different GaAs/Al_{0.3}Ga_{0.7}As device
structures, the first consisting of 71.5 periods of 100 Å wells with 35 Å barriers and the second
consisting of 60.5 periods of 100 Å wells with 65 Å barriers. Figure 1 shows a 3-dimensional
plot of the PL decays at a wavelength of 850 nm, from the 35 Å barrier device, for various
applied fields between +11 kV cm^-1 and -175 kV cm^-1. Note the two different linear scales on
the applied field axis, and that the decays have been normalised to the time-averaged PL
intensity. The device mesa size was ~40 µm × 40 µm and the excitation spot was determined to
be sufficiently large to eliminate transverse diffusion effects. The peak photogenerated carrier
density  was  estimated  to  be  ~  5  x  10*^cm“^ 

Clearly, as the reverse bias is increased from the zero applied field condition, the rate
at which photogenerated carriers are ejected from the wells and swept towards the corresponding
doped regions increases, resulting in a reduced PL intensity and decay time. When a forward
bias is applied to the device this begins to cancel the built-in field (of ~ -8 kV cm^-1) and thus the
PL decay becomes longer. To investigate the mechanisms responsible for field induced carrier
emission from the wells, the mean PL decay time for each measurement was determined and
the results are plotted as a function of applied field in figure 2. In the high field regime (i.e.
with an applied field < -10 kV cm^-1) the PL decays are predominantly single exponential, whilst
for lower fields there is evidence of trap saturation and thus the decays are no longer single
exponential.

Fig. 1 PL decays from a GaAs/AlGaAs MQW p-i-n structure as a function of applied field.
The decays have been scaled to the integrated PL intensity.

In the high field regime defined above, the field dependence of the PL decay time, for
both device structures, is exponential over more than a decade in applied field and two
decades in decay time. This exponential dependence over such a large range in applied field
would  seem  to  eliminate  tunnelling  as  the  mechanism,  since  equation  (3)  indicates  that  this 
should  be  non-exponential  with  field.  Assuming  the  thermionic  emission  model  of  equation 
(1) then the gradient of the curves for the 35 Å and 65 Å structures should be given as
eL_w/2kT. The experimentally observed values are 25.1 × 10^-3 cm kV^-1 and 31.5 × 10^-3 cm kV^-1
respectively and are in good agreement with that deduced by Cavaillès et al. [8] using a
pump-probe technique on asymmetric SQW waveguides. The PL decay time for each
structure  appears  to  be  directly  related  to  the  number  of  wells  and  clearly  indicates  the 
importance  of  retrapping  of  carriers,  by  adjacent  wells,  during  sweepout. 
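
For comparison with the measured gradients, the value of eL_w/2kT predicted by the thermionic emission model can be evaluated directly. The short Python sketch below does this for the 100 Å wells, assuming a room temperature of 300 K (the exact measurement temperature is not stated) and converting the result to cm/kV; the function name is illustrative.

```python
E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant in J/K


def thermionic_gradient_cm_per_kv(well_width_m, temperature_k):
    """Return e*L_w / (2*k*T), the slope of ln(decay time) versus field.

    The SI result is in m/V; 1 m/V equals 1e5 cm/kV.
    """
    slope_m_per_v = E_CHARGE * well_width_m / (2.0 * K_BOLTZMANN * temperature_k)
    return slope_m_per_v * 1e5


well_width = 100e-10   # 100 angstrom wells (from the text)
temperature = 300.0    # ASSUMED room temperature in kelvin

gradient = thermionic_gradient_cm_per_kv(well_width, temperature)
print(f"e*L_w/(2kT) ~ {gradient * 1e3:.1f} x 10^-3 cm/kV")
```

The prediction depends only on the well width and temperature, so in this simple model it is the same for both barrier thicknesses.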


Fig. 2 PL decay time versus applied field (kV cm^-1) for the 35 Å and 65 Å barrier p-i-n diodes

Figure 3 shows the PL decay time versus applied field for the 35 Å barrier structure, at
various  temperatures.  The  figure  indicates  a  general  increase  in  the  sweepout  time  with 
decreasing temperature consistent with the freeze-out of thermal effects. At temperatures
<150 K a clear minimum develops in the PL decay time at an applied field of ~25 kV cm^-1. This
would be consistent with resonant tunnelling between the heavy hole sub-bands in adjacent
wells. Resonant tunnelling of electrons would be expected to occur at an applied field of
~50 kV cm^-1. (Sweepout of a single carrier type is sufficient to terminate the PL signal.)


Fig. 3 PL decay time versus applied field for the 35 Å barrier p-i-n diode at various
temperatures 

In conclusion, TRPL has been used to study the carrier dynamics in biased MQW p-i-n
structures. In the high field limit (i.e. >10 kV cm^-1), at room temperature, the sweepout
mechanism appears to be of thermal origin; however, whether this is dominated by electrons
or holes has not been determined. At low temperatures, there is evidence that resonant
tunnelling  between  heavy  hole  sub-bands  in  adjacent  wells  is  of  importance. 

The  devices  used  in  these  experiments  were  on  SEED  array  chips,  commercially 
available  from  AT&T  Microelectronics.  The  authors  acknowledge  the  support  of  the  Royal 
Society  Paul  Instrument  Fund  and  the  UK  Science  and  Engineering  Research  Council 
(SERC).  SJF  is  supported  by  a  SERC  CASE  studentship  with  Edinburgh  Instruments  Ltd. 
The  actively  quenched  SPADs  are  used  by  agreement  of  Prof.  Sergio  Cova  and  co-workers, 
Politecnico di Milano, Italy.

Abstract. We present a novel smart pixel structure which integrates a high
electron mobility transistor structure within an asymmetric Fabry-Perot optical
modulator. There is no doping below the transistor layers. The structure only
requires standard processing techniques. It facilitates the design of high speed, low
noise smart pixel circuitry.


1.  Introduction 

Monolithic integration of electronic components with optical elements provides the means
whereby practical optoelectronic data processing becomes a reality.


Figure 1. The proposed structure (layer label: Bragg mirror, undoped)


Following recent interest in smart pixels [1,2,3,4] we have developed a novel wafer
structure (figure 1) which combines an AlGaAs p-i-n detector/modulator structure and a
quantum-well high electron mobility transistor (QW-HEMT) structure.

2. The problem of backgating

This  design  avoids  the  presence  of  a  p-n  junction  under  the  QW-HEMT  which  would  otherwise 
result  in  backgating  [1].  Backgating  reduces  the  output  optical  contrast  ratio  of  a  modulator 
drive  circuit  and  gives  rise  to  a  gradual  switching  action.  It  arises  when  a  continuous  p-layer 
under  more  than  one  n-channel  device  is  held  at  a  fixed  voltage.  Figure  2  illustrates  this 
problem. 


Figure 2. Illustration of backgating (figure labels: light; modulator or photodiode)


The  FET  layers  form  the  n-layers  of  the  p-i(MQW)-n  modulator  structure.  We  assume 
the  p  layer  is  held  at  a  constant  bias  and  that  the  n-layer  potential  beneath  point  'A'  varies 
through  the  circuit  action.  When  the  voltage  at  'A'  rises,  the  diode  structure  beneath  the  FETs  is 
reverse  biassed.  The  depletion  region  associated  with  this  diode  extends  upwards.  The  electron 
density  in  the  conducting  channel  falls  and  so  the  circuit  operation  is  interfered  with. 


Figure 3. QW-HEMT schematic band diagrams (figure label: more positive gate voltage)


3.  Structure  description 

Our  solution  to  backgating  is  a  modified  asymmetric  Fabry-Perot  modulator  (AFPM)  allowing 
high  contrast  reflection  modulation  and  efficient  photodetection  (figure  1).  The  QW-HEMT 
layers  are  situated  inside  the  AFPM  cavity  on  top  of  an  undoped  back  Bragg  mirror  (DBR).  The 
QW-HEMT  contacts  the  underside  of  the  quantum  well  region  directly.  A  deep  wet  etch 
uncovers the QW-HEMT surface for processing. This leaves the p-i-n detectors and modulators
as mesas.

Two  schematic  conduction  band  diagrams  of  a  QW-HEMT  are  shown  in  figure  3.  The 
gate  metal  forms  a  Schottky  barrier  to  the  AlGaAs.  Providing  that  the  upper  AlGaAs  layer  is 
thin enough, the density of electrons on the GaAs side of the AlGaAs-GaAs heterojunction is
controlled  by  the  voltage  applied  to  the  gate. 


4.  Results 

Figure  4  shows  the  Fabry-Perot  resonance  wavelength  map  of  a  2  inch  wafer  of  our  epitaxial 
structure. Our required resonance wavelength (operating wavelength) is 860 nm. The wafer was
grown by MOCVD.


Figure 4. 2 inch wafer resonance wavelength map of complete structure


In  our  MOCVD  grown  samples  the  DBRs  suffered  from  p-type  impurities;  this  prevented 
transistor  operation.  We  are  now  investigating  MBE  grown  samples. 


5.  Discussion 

We  can  see  from  the  wafer  map  that  the  resonance  position  varies  over  the  wafer.  This  certainly 
poses  an  obstacle  to  the  use  of  AFPMs  in  commercially  viable  systems.  Some  solutions  have 
been  proposed  by  other  groups  [5,6].  The  structure  is  flexible  in  that  it  permits  etching  of  the 
front  surface  of  the  modulator  in  order  to  adjust  its  optical  resonance  wavelength.  Alternatively 
it would be possible to deposit an additional front mirror to increase the contrast ratio of the
output modulation. For the moment we may circumvent the problem by anti-reflection coating
the modulator face.

The  solution  to  backgating  used  in  FET-SEED  systems  involves  the  insulation  of 
selected  areas  of  the  offending  p-layer  with  proton  bombardment.  The  conducting  islands  that 
remain beneath the FETs are contacted with Be implantation [3]. We expect circuit operation to
be  faster  using  our  proposed  design  as  it  avoids  the  parasitic  drain-source  capacitance 
associated  with  the  presence  of  biassed  conducting  material  beneath  FETs  [7].  Our  solution 
requires  the  use  of  the  less  complex  processing  technique  of  etching.  Although  this  leaves  us 
with  a  non-planar  surface,  optical  lithography  is  still  possible. 

Electrons  in  HEMTs  are  spatially  separated  from  their  donors  and  move  in  an  undoped 
channel  so  they  experience  less  scattering.  The  HEMT  is  therefore  a  low  noise  device  rendering 
it  suitable  for  low  noise,  high  speed  circuitry. 

Our  proposal  also  facilitates  the  optimisation  of  the  high  speed  performance  of  the 
AFPM  [8].  This,  married  with  the  high  bandwidth  offered  by  the  QW-HEMT,  holds  great 
potential  for  high  speed  smart  pixels. 


6.  Conclusion 

We  have  presented  an  alternative  method  of  integrating  FETs  with  multiple  quantum  well 
modulators.  We  suggest  the  use  of  a  QW-HEMT  because  of  its  low  noise  and  high  bandwidth. 
We  minimise  FET  parasitic  drain-source  capacitance  by  avoiding  the  presence  of  contacted 
conducting  material  beneath  the  transistor  layers.  We  expect  this  will  allow  faster  circuit 
operation  [7]. 

The  wafer  processing  required  is  standard;  the  mesa  modulators  and  detectors  are 
compatible  with  optical  lithography.  The  mesa  modulator  approach  has  already  been  used  to 
optimise  AFPM  bandwidth  [8].  Therefore  we  can  envisage  high  speed  analogue  smart  pixel 
systems based on this approach.

We demonstrate a novel all-optical Quantum Confined Stark modulator. The necessary
electric field is switched on optically across every period of the heterostructure, by exploiting
strongly asymmetric photocarrier transfer in GaAs/AlAs layers. In an unoptimized sample,
we measured at room temperature a 9 meV exciton redshift at 860 nm induced by optical
excitation of about 50 W/cm². The origin of an unwanted saturation mechanism limiting the
modulator performance is also discussed.


We  present  preliminary  results  on  a  contact-free  quantum  well  (QW)  light  modulator 
operated  all-optically,  in  the  sense  that  one  "write"  laser  beam  strongly  affects  the  optical 
constants  in  the  vicinity  of  a  QW  exciton  line  and  thus  the  output  of  a  resonant  "read"  beam. 
Previous  work  on  all-optical  QW  modulators  has  focused  on  n-i-p-i  heterostructures  [1].  The 
main  advantage  of  the  present  design  over  the  n-i-p-i's  is  that  the  estimated  device  response 
times  are  shorter  by  more  than  three  orders  of  magnitude  [2],  allowing  MHz  modulation 
rates.  The  principle  of  operation  of  the  device  is  as  follows:  every  period  of  the 
heterostructure  contains  three  QWs  designed  in  such  a  way  that  following  above  bandgap 
photoexcitation  a  large  fraction  of  the  photogenerated  electrons  and  holes  tend  to  separate 
and  accumulate  in  the  exterior  QWs,  creating  a  local  space-charge  field  having  its  maximum 
in  the  region  in  between  and  acting  via  the  Quantum  Confined  Stark  Effect  (QCSE)  [3]  on  the 
exciton  resonance  of  the  central  QW. 

In  Figure  1,  we  show  a  schematic  band  diagram  of  one  period  of  the  heterostructure. 
The three QWs are enumerated from left to right. QW1 is 25 Å of GaAs, surrounded by a
graded AlGaAs (x=36-42%, 800 Å) and an AlAs barrier (100 Å). The combination of a narrow
GaAs QW adjacent to an AlAs layer is the key aspect of the device. Following optical
excitation with a photon energy larger than the energy gap of the AlGaAs graded barrier or
QW1 (λ_write < 620 nm or 700 nm, respectively, at T=300K), this layer combination functions
as a one-way "quantum filter" for the photocarriers collected in QW1, by blocking the holes
out but allowing the electrons to transfer rapidly into QW2. This is due to the fact that, for
sufficiently thin GaAs layers, the electron energy level at the Γ-point in the GaAs
QW is higher than the one at the X-point in the AlAs layer. Γ-X electron transfer times on
the picosecond time scale have been measured in thin GaAs/AlAs superlattices [4]. QW2 is
150 Å of GaAs and QW3 for the purposes of this report can be viewed as 80 Å of InGaAs
(x=10%). In actuality, QW3 consists of 70 Å of GaAs in which 2 monolayers of InAs islands
are  epitaxially  inserted.  The  reason  for  this  is  that  one  projected  application  for  this  modulator 
is all-optical photorefractivity. The use of InAs islands is meant to eliminate lateral diffusion,
a matter of critical importance for such devices [5]. The electrons that pass through the quantum filter,
leaving behind holes, are able to subsequently tunnel through a thin (<50 Å) AlGaAs barrier
into QW3. The resulting space-charge electric field acts upon the excitonic resonance of
QW2 (λ_read ≈ 860 nm).

Figure 1: Schematic band diagram of one period of the heterostructure (layer labels: AlAs, AlGaAs, GaAs, InGaAs) and photogeneration of the electric field
by rapid electron transfer from QW1 to QW3. The QW energy levels are denoted by full lines whereas the X-
point energy level in the AlAs layer by a dotted line.


Figure 2: Redshift of the QW2 exciton feature observed in room temperature transmission spectra (horizontal axis: wavelength, 850-870 nm), with
(Argon) or without (Dark) photoexcitation by 20 W/cm² of Argon laser blue lines.

We can directly monitor the photogenerated electric field across QW2 by measuring
the redshift of the exciton line induced by a coincident laser source. In Figure 2, we show the
room temperature transmission spectra of one of our samples in the region of the QW2
exciton, with and without photoexcitation by 20 W/cm² of the Argon laser blue lines. The
transmission  exciton  feature  contains  contributions  from  both  heavy  and  light  hole  excitons, 
accounting  for  the  asymmetric  profile  on  the  high  energy  side.  The  whole  exciton  feature 
clearly  redshifts  under  the  effect  of  illumination.  Redshifts  up  to  9  meV  at  room  temperature 
were  registered  with  about  50  W/cm^  of  optical  excitation.  Based  on  available  data  on  the 
QCSE-induced  redshifts  as  a  function  of  electric  field  for  similar  QW  widths  [6],  we  estimate 
that  the  9  meV  redshift  corresponds  to  an  effective  electric  field  of  nearly  30  kV/cm.  Further 
increasing  the  incident  laser  intensity  does  not  significantly  increase  the  photogenerated 
electric  field,  indicating  that  the  optictd  charging  process  is  subject  to  a  saturation  mechanism 
which  we  will  discuss  next. 

In  Figure  3,  we  plot  the  QW2  exciton  redshift  obtained  in  photoluminescence  (PL) 
experiments  excited  by  the  647  nm  Krypton  laser  line  [7],  as  a  function  of  incident  laser 
power  density  for  various  temperatures.  The  main  observation  in  this  figure  is  that  the  exciton 
redshift  saturates  with  increasing  power  densities.  For  T=150K,  for  instance,  the  redshift 
saturates  at  a  value  corresponding  to  a  photogenerated  electric  field  of  about  30kV/cm.  For 
temperatures  below  150K  the  curves  (not  shown  here)  overlap  with  the  one  for  T=150K.  On 
the other hand, as the temperature increases above 150K the onset of the saturation occurs at
gradually smaller exciton redshifts, i.e. electric fields. The curve denoted by "1/20" was taken under
the same conditions as the curve for T=150K but with the duty cycle reduced by a factor of 20.
The fact that the two curves practically overlap excludes any scenario of heating. Finally, the
curve "780nm" verifies that excitation by a laser diode with λ=780 nm does not generate an
electric field since it is only absorbed in QW2 and QW3.


Figure 3: Exciton redshift of QW2 as a function of incident laser power density (λ=647 nm) for various
temperatures. The curve "1/20" is obtained under the same conditions as the one for T=150K but with a
reduced  duty  cycle  by  a  factor  of  20.  The  curve  "780nm"  is  obtained  at  T=150K  with  an  excitation  wavelength 
of  780  nm. 




In  order  to  optimize  the  modulator  performance,  it  is  important  to  understand  the 
saturation  mechanism.  In  fact,  based  on  simple  modelling  calculations  neglecting  any 
saturation mechanism, we estimate that an incident laser power density of only a few W/cm²
is sufficient to create an electric field of about 50 kV/cm. In other words, comparing our
experiment with the calculation, we conclude that the optical charging mechanism would be
at least an order of magnitude more efficient if the saturation mechanism were absent.

We consider several possibilities for the saturation mechanism, such as: (i) quenching
of the Γ-X transfer process from QW1 to QW2 due to electric field-induced band bending,
(ii) quenching of the electron tunneling step from QW2 to QW3 for the same reason, and (iii)
back-transfer of electrons from QW3 to QW2, which becomes possible at an elevated electric
field. First, we note that scenarios (i) and (ii) are temperature independent, therefore they
cannot account for the temperature behavior of the saturation mechanism. In addition, we can
exclude scenario (i) based on time-resolved PL-decay experiments [8] which indicate that the
Γ-X transfer time remains smaller than 20 ps (our time resolution in this case) even for
photogenerated electric fields higher than 100 kV/cm. Hence, the back-transfer of electrons
from QW3 to QW2 above some value of the electric field is most likely to be responsible for
the saturation mechanism. In fact, we estimate that the electron density we can store in the
lowest level of the InAs islands is only 8×10¹⁰ cm⁻². In order to create an electric field of
30 kV/cm we need electron densities of about 2×10¹¹ cm⁻². Therefore, occupation of the
excited InAs electron levels must take place. The saturation mechanism may easily arise from
an electric field-induced alignment of the lowest n=1 electron level of QW2 with the first
excited electron state of the InAs islands, allowing back-transfer of electrons from QW3 to
QW2. The zero field energy difference between the two levels is estimated to be 40-50 meV.
For a spatial separation of the electronic wavefunctions of 120-150 Å in the samples studied,
such an alignment is possible for electric fields in the range of 25-40 kV/cm. This is quite
commensurate with our experimental finding that the electric field saturates at 30 kV/cm.
Also, in this picture, the temperature behavior can be understood in a natural way since it
involves two thermally distributed electron populations brought into resonance. Further
investigations are under way in order to clarify this important issue.
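
The two order-of-magnitude estimates in the preceding paragraph can be checked with a short calculation. The sketch below is only illustrative: it assumes a simple parallel-plate picture (separated electron and hole sheets, field = sheet charge / permittivity), a static relative permittivity of 12.9 for GaAs, and a uniform field across the wavefunction separation. The function names are hypothetical and not taken from the paper.

```python
# Physical constants (SI units)
E_CHARGE = 1.602176634e-19     # elementary charge, C
EPS0 = 8.8541878128e-12        # vacuum permittivity, F/m
EPS_R_GAAS = 12.9              # assumed static relative permittivity of GaAs


def sheet_density_for_field(field_kv_per_cm, eps_r=EPS_R_GAAS):
    """Sheet carrier density (cm^-2) needed for a given field (kV/cm).

    Parallel-plate assumption: field = e * density / (eps0 * eps_r).
    """
    field_v_per_m = field_kv_per_cm * 1e5          # 1 kV/cm = 1e5 V/m
    density_per_m2 = EPS0 * eps_r * field_v_per_m / E_CHARGE
    return density_per_m2 * 1e-4                   # m^-2 -> cm^-2


def alignment_field(delta_e_mev, separation_angstrom):
    """Field (kV/cm) whose potential drop over the separation equals delta_e.

    Assumes a uniform field between the two electron wavefunctions.
    """
    volts = delta_e_mev * 1e-3                     # meV per electron -> V
    metres = separation_angstrom * 1e-10
    return volts / metres / 1e5                    # V/m -> kV/cm


density_30 = sheet_density_for_field(30)           # about 2e11 cm^-2
field_low = alignment_field(40, 150)               # lower end of the range
field_high = alignment_field(50, 120)              # upper end of the range
```

Under these assumptions the density required for 30 kV/cm comes out near 2×10¹¹ cm⁻², and the alignment field spans roughly 27 to 42 kV/cm, in line with the 25-40 kV/cm range quoted in the text.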

In  summary,  we  present  a  novel  heterostructure  device  to  be  used  as  an  all-optical 
quantum  well  modulator.  The  modulation  principle  is  based  on  the  Quantum  Confined  Stark 
Effect but needs no electrical contacts since it relies on the property that an incident laser is able
to switch on an electric field in every period of the heterostructure. In early experiments, we
demonstrate photogenerated electric fields up to 30 kV/cm with moderate laser intensities.

AlGaAs and QW1 are excited, whereas in the latter case only QW1 contributes to the optical charging
process.

[8] N. T. Pelekanos, U. Strauss, W. W. Rühle, unpublished.

Abstract. Wafer level optical and electronic testing of smart pixel arrays is a must if a high yield
technology  is  to  be  developed.  Our  results  on  GaAs  FET-SEED  based  switching  nodes  show  the 
uniformity  and  performance  control  levels  achievable  with  today's  technology,  as  well  as  its  limits. 


The  fabrication  of  optoelectronic  components  must  be  a  high  yield  process  in  order  to 
accomplish  large  scale  manufacturing  of  smart  pixel  arrays  for  applications  in  optical 
interconnects,  photonic  switching  [1],  spatial  light  modulators  [2]  and  optical  computing.  Smart 
pixels  consist  of  optical  input  and  output  devices,  integrated  with  increasingly  sophisticated 
circuits  for  gain  and  logic  functions  [3].  For  example,  in  a  recently  demonstrated  five  stage 
photonic  switching  system  [4,5],  each  stage  consisted  of  an  array  of  4x4  routing  nodes  (pixels), 
made  using  the  field-effect-transistor  self  electro-optic  effect  device  (FET-SEED)  technology 
[6,7].  Each  node  contained  25  transistors  and  17  quantum  well  diodes;  each  stage  (or  chip)  with 
16  such  nodes  had  400  transistors  and  272  diodes.  Although  simple  FET-SEED  circuits  have  been 
shown  to  switch  in  times  as  short  as  200ps  [8],  and  fully  functional  individual  nodes  operating  at 
400Mb/s  have  been  demonstrated  [4,5],  the  speed  of  an  entire  array  (or  stage)  was  only 
200Mb/s,  and  that  of  the  entire  system  was  155Mb/s,  limited  by  the  non-uniformity  of  the 
individual nodes and/or arrays. To ensure uniformity of the devices and chips, wafer level electrical
and optical testing is necessary. This paper presents some of our methods of mapping the
performance  of  smart  pixel  arrays  at  the  wafer  level,  as  well  as  our  results. 

The  FET-SEED  technology  [6,7]  is  based  on  GaAs/AlGaAs  quantum  well  diodes  and  field- 
effect-transistors.  The  material  grown  by  molecular  beam  epitaxy  (MBE)  undergoes  about  ten 
major  processing  steps,  including:  photolithography,  etching  and  ion  implantation  for  device 
definition  and  insulation,  metal  deposition  for  ohmic  contacts,  Schottky  gates  and  electrical 
interconnections,  and  deposition  of  insulator  layers  for  electrical  insulation  and  anti-reflection 
coating.  Planar  process  technology  [7]  allows  for  batch  fabrication  of  arrays  of  complicated 
circuits, such as the 2x1 switching node schematically described in Fig. 1. The node consists of an
optical receiver, an inverter, a control memory and a multiplexer/driver/transmitter [4]. The
electrical output A of the receiver becomes one of the inputs to the 2x1 multiplexer/driver located
within the same node. The other input, B, comes from a receiver located in another node.

[Fig. 1 circuit schematic; legend: detector diode, Schottky diode, single gate FET, dual gate FET, indicator circuit]

Fig. 1. Schematic diagram of a single node. Notice the four "indicator lights" placed at the bottom left (the A and
Ā), at the center (Q), and the top right (Q̄).

Each mux/driver also has a pair of complementary electrical inputs (Q and Q̄ in Fig. 1),
whose role is to control which one of the inputs A or B is regenerated as the optical output C.
The control memory (set-reset latch) stores this control bit. An electrical control signal, Vce,
common to all the nodes in the array, is held either high, to enable writing of the memories with
the control bits preceding the data bits, or low, to preserve the memory during the time the data
are processed [4].

The  circuits  contain  quantum  well  p-i-n  diodes  for  detectors  and  modulators,  Schottky 
clamping  diodes,  single-gate  and  double-gate  field-effect  transistors.  Single  diodes,  transistors, 
resistors  and  conductors  are  fabricated  adjacent  to  each  array  of  smart  pixels,  for  individual 
component  characterization.  By  measuring  their  characteristics  and  mapping  them  over  the  area  of 
the  wafer  we  obtain  the  yields  of  individual  devices,  as  well  as  the  degree  of  uniformity.  These 
yields  are  in  general  very  high,  of  the  order  of  90%  and  better. 

An example of data mapping on a 3" wafer is shown in Fig. 2(a), for threshold voltages
of FETs with 10 µm long gates. The standard deviation of this distribution is ΔV=100 mV, as
shown in Fig. 2(b). For 2" diameter wafers the distribution is narrower: ΔV is 40-70 mV,
depending on the particular choice of layer and doping structure. The pattern in Fig. 2(a) is
circular, which indicates that the variation is likely due to thickness and/or doping variation in the
MBE growth, rather than to the numerous subsequent processing steps.

Fig. 2. (a) Mapping of threshold voltages on a 3" wafer. (b) Corresponding statistical charts.

When  packed  together  into  large  circuits  and  arrays  of  circuits,  the  performance  and 
uniformity  of  the  individual  components  may  suffer.  Therefore  it  is  necessary  to  also  map  the 
performance  of  the  nodes.  Testing  the  functionality  of  these  nodes  is  a  challenging  task,  as  it 
must  address  both  optical  and  electronic  functions. 

A  simple  optical  test  can  be  done  using  the  light  emitting  properties  of  GaAs  diodes. 
Connected  in  series  with  a  FET,  a  light  emitting  diode  (LED)  will  "light  up"  if  the  FET  becomes 
conducting.  Such  a  FET-LED  pair  thus  acts  as  an  "indicator  circuit",  sensitive  to  the  voltage  on 
the gate of the FET. Indicators inserted at critical points of the nodes, as illustrated in Fig. 1,
will light up if the voltages tested have the correct values.

The implementation of this technique for monitoring the receiver (A), inverter (Ā) and
control memory (Q, Q̄) for an entire array of nodes is schematically described in Fig. 3(a). For
testing, a periodic sequence of high and low A signals is simulated electrically and coincides in
time with a periodic sequence of high and low Vce signals. Since only the "high Vce" allows
changing the state of the memory, the rate at which the Q and Q̄ indicators light up is half the rate
at which the A and Ā indicators light up.
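
The halving of the indicator rate can be illustrated with a small logic simulation. This is a sketch under stated assumptions, not the actual test sequence: the exact waveforms are not given in the text, so the pattern below (A toggling every step, Vce high for one A period and low for the next) is hypothetical, and the control memory is modelled as a simple latch that follows A while Vce is high and holds otherwise.

```python
def simulate_latch(a_signal, vce_signal, q_initial=0):
    """Return the stored bit Q after each time step.

    Q follows A while Vce is high (write enabled) and holds its value
    while Vce is low (memory preserved).
    """
    q = q_initial
    trace = []
    for a, vce in zip(a_signal, vce_signal):
        if vce:
            q = a
        trace.append(q)
    return trace


def count_turn_ons(signal, initial=0):
    """Count low-to-high transitions, i.e. how often an indicator lights up."""
    previous = initial
    count = 0
    for value in signal:
        if value and not previous:
            count += 1
        previous = value
    return count


# Assumed test pattern: A toggles every step; Vce is high for one full
# A period and low for the next, so only every second A pulse is written.
steps = 16
a_signal = [1 if t % 2 == 0 else 0 for t in range(steps)]
vce_signal = [1 if (t // 2) % 2 == 0 else 0 for t in range(steps)]
q_trace = simulate_latch(a_signal, vce_signal)

a_rate = count_turn_ons(a_signal)   # turn-ons of the A indicator
q_rate = count_turn_ons(q_trace)    # turn-ons of the Q indicator
```

With this pattern the Q indicator lights up on every second A pulse, which reproduces the factor of two described above.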

Fig. 3. (a) Schematic description of the optical functional test of a 4x4 array of switching nodes. The first cell
illustrates the four "indicators" which are monitored: signal A from the receiver; the signal Ā from the
inverter; the two control signals from the memory, Q and Q̄. (b) The actual image of an array. In each of the A and
Q pairs, one of the indicators is bright and the other is dim.

An infrared camera and a VCR are used to record the indicator lights turning on and off. The
"motion picture", a frame of which is illustrated in Fig. 3(b), contains the images of all the
indicators for every array on the wafer. Images like these, as well as those of specially designed
"yield tester arrays" (consisting of indicator circuits only), allowed us to assess the uniformity of
the arrays and their yield. Depending on the type of array, the functional yields vary from 5% to
50%. Note that the yields of the individual devices were larger. Our preliminary analysis of
failures indicates that the vast majority of them are metal deposition errors, affecting especially the
FET gates. These errors are more severe when many FETs are fabricated close together.
Eliminating these errors requires improving the lift-off process used in gate fabrication, or
replacing it with subtractive patterning. Using projection photolithography instead of the current
contact photolithography would also help to reduce the number of metal deposition defects.
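
A rough way to see why functional yields fall well below individual device yields is to multiply the device yields together. The sketch below assumes independent, identical failure probabilities for every device, which is a simplification (the text notes that defects become more severe when FETs are packed closely). The device counts per node and per array are those quoted in the introduction; the function names are illustrative.

```python
def node_yield(device_yield, transistors=25, diodes=17):
    """Probability that every device in one node works, assuming independence."""
    return device_yield ** (transistors + diodes)


def array_yield(device_yield, nodes=16, transistors=25, diodes=17):
    """Probability that all nodes of one 16-node array (chip) work."""
    return node_yield(device_yield, transistors, diodes) ** nodes


# Compare a few hypothetical device yields.
for y in (0.90, 0.99, 0.999):
    print(f"device yield {y:.3f}: node {node_yield(y):.3f}, array {array_yield(y):.3f}")
```

Under this independence assumption, a 90% device yield leaves only about 1% of 42-device nodes fully working, so functional yields of tens of percent imply per-device yields much closer to 100%.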

In conclusion, wafer level electrical and optical testing of FET-SEED smart pixel arrays
shows the uniformity and performance levels achievable with today's technology. The high yield
fabrication of densely packed FET based GaAs circuits appears to be its most challenging task.

Abstract. A layer structure, grown in a single step, and a fabrication process were developed for
the  monolithic  integration  of  AlGaAs/GaAs  optoelectronic  smart  pixels.  Metal  semiconductor 
field-effect  transistors  (MESFETs),  light-emitting  diodes  (LEDs)  and  photodiodes  (PDs)  were 
designed  in  such  a  way  that  the  light  emitted  from  the  LEDs  can  efficiently  be  detected  by  the 
PDs.  An  example  is  presented  of  a  fabricated  optoelectronic  smart  pixel:  a  threshold  circuit 
consisting  of  a  dual-photodiode  differential  input,  an  inverter  and  a  current-balanced  output 
containing an LED. The circuit shows a switching energy of 2 pJ. The minimum switching power
is < 1 nW with a contrast ratio > 1000. The maximum light output of the LED is 18 µW with an
overall power dissipation of 20 mW.

Monolithically integrated optoelectronic "smart pixels" are of current interest in the field of
parallel optical interconnects, early vision processing and optoelectronic neural networks
[1,2]. We have developed a fabrication process where, in a flexible way, different kinds of
optoelectronic smart pixels can be realized in the AlGaAs/GaAs material system [3]. The
material is grown in a single step by metal organic chemical vapour deposition (MOCVD).
On an n+-doped GaAs substrate, we first grow a stairstep single quantum well LED which
ends with a 400 nm thick p+-contact layer (Fig. 1). The PD/MESFET layers are grown on
top of this structure. They consist of a 1 µm thick GaAs p⁻-buffer/absorber layer, a 200 nm
thick 1×10¹⁷ cm⁻³ n-doped GaAs channel, a 10 nm thick etch-stop layer of Al0.3Ga0.7As
and a GaAs n+-contact layer.

The p⁻n-junction between the buffer/absorber layer and the channel of the MESFET is
used as a PD.


[Fig. 1 layer diagram; device labels: LED, MESFET, PD]

Fig. 1. Schematic layer sequence of an optoelectronic smart pixel.

The wavelength of the light emitted by the LED is set to 790 nm by the aluminium content
and the thickness of the quantum well. Therefore the emitted light can efficiently be detected
by the PD. The etch-stop layer ensures a homogeneous MESFET threshold voltage over the
wafer. The entire n-p-n layer structure is described in [4].

The  fabrication  process  requires  eight  photolithographic  steps  and  is  based  on  mesa 
isolation.  The  backside  ohmic  contact  is  formed  first  by  a  Ge-AuNiAu  metallisation.  This  is 
the  common  n-contact  to  all  LEDs  on  the  wafer.  For  the  n-side  ohmic  contacts  of  the 
MESFETs  and  the  PDs,  a  NiGe-AuNiAuPt  [5]  metallisation  is  evaporated  and  annealed  at 
440 °C. Then the individual MESFETs and PDs are separated by magnetron enhanced
reactive ion etching (MIE) of mesas. After the deposition of a Si3N4 dielectric film,
windows to the ohmic contacts of the MESFETs and PDs (n-terminals) as well as of the LEDs
and the PDs (p-terminals) are opened by reactive ion etching (RIE). The gate areas are then
opened  by  RIE.  The  dielectric  film  is  used  as  a  mask  for  a  selective,  recessed  wet-etch  of  the 
gate  area  [6].  This  etch  removes  the  n+-contact  layer  and  undercuts  the  dielectric  film  by 
approximately  200  nm.  The  next  metallisation  (TiPtAu)  forms  the  Schottky  gate  contacts  for 
the  MESFETs.  This  metallisation  is  also  used  as  the  p-contact  to  the  LEDs  and  the  PDs  and 
as  a  first  wiring  level.  The  LEDs  are  isolated  by  a  second  level  of  MIE  mesa  etching.  The 
main wiring level (TiAl) is deposited onto the second dielectric film and patterned by
wet-etching.


Fig. 2. Left: the drain I-V characteristic of a MESFET with a gate length of 1.5 µm and an
effective width of 80 µm. Right: the quantum efficiency and responsivity vs.
wavelength for a PD with an active area of 50×50 µm².


The fabricated individual devices are then characterised (Fig. 2). The MESFET shows
a maximum transconductance of 80 mS/mm and a current density Ids of 226 mA/mm for a
gate length of 1.5 µm. The quantum efficiency of the PD is 0.85 at the LED peak emission
wavelength of 790 nm. An efficiency of 0.008 W/A is measured for a 25×25 µm² LED.
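
The quoted quantum efficiency can be related to the photodiode responsivity through the standard relation R = η·e·λ/(h·c). The following sketch applies it to the values above; the function name is illustrative, and the result is a derived figure rather than one stated in the text.

```python
PLANCK = 6.62607015e-34        # Planck constant, J s
LIGHT_SPEED = 2.99792458e8     # speed of light, m/s
E_CHARGE = 1.602176634e-19     # elementary charge, C


def responsivity(quantum_efficiency, wavelength_nm):
    """Photodiode responsivity in A/W for a given quantum efficiency."""
    photon_energy_j = PLANCK * LIGHT_SPEED / (wavelength_nm * 1e-9)
    # One electron per absorbed photon, scaled by the quantum efficiency.
    return quantum_efficiency * E_CHARGE / photon_energy_j


r_790 = responsivity(0.85, 790)   # roughly 0.54 A/W
```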

In  this  layer  design,  the  devices  are  connected  to  parasitic  elements  in  three  different 
ways.  1)  The  MESFET  is  connected  to  the  photodiode  below  it  which  is  in  turn  connected  to 
an  LED.  A  potential  between  the  MESFET  source  and  the  p-contact  of  the  photodiode  below 
causes  a  backside  channel  depletion  which  results  in  a  threshold  voltage  shift.  2)  The  p- 
contact  of  the  photodiode  is  connected  to  an  LED  below,  introducing  an  additional 
capacitance.  3)  All  LEDs  share  a  common  n-side  contact.  These  facts  must  be  considered  in 
the circuit design.

The fabricated optoelectronic smart pixels are threshold circuits. The example in Fig. 3
consists  of  a  dual-photodiode  differential  input  connected  to  an  inverter  (Ml,  M2),  which 
switches  the  LED  driver  (M3,  M4,  LED).  The  LED  is  turned  on  when  the  input  power  of  the 
switching  beam  (Pswitch)  exceeds  that  of  the  reference  beam  (PRef).  The  switching  threshold 
can  thus  be  controlled  by  the  reference  beam.  The  LED  driver  is  current-balanced:  if  the  LED 
is  off,  the  current  flows  through  MESFET  M3,  and  the  LED  is  turned  on  by  pinching  off  M3. 


Fig.  3.  Diagram  of  the  measured  threshold  circuit. 


The dc-behaviour of this threshold circuit is shown in Fig. 4. The output power is 18
µW in the ON state, and a contrast ratio greater than 1000 has been estimated. The minimum
switching power is measured to be less than 1 nW. At an input power level of 200 nW, the
circuit has a switching delay of 8.4 µs, which corresponds to a switching energy of 1.7 pJ.
The circuit occupies 200×300 µm² with a photodetector area of twice 50×50 µm². The power
dissipation remains constant at 20 mW.


[Fig. 4 plot; horizontal axis: light input power [nW], 90.0 to 92.0]

Fig. 4. The measured optical output power vs. optical input power of the threshold circuit.

At the present switching energy level of approximately 2 pJ, a maximum switching
speed of 4 MHz is achievable if the entire 18 µW output power of our existing smart pixel is
available at the detector. The 12×12 µm² LED shows no thermal rollover in the output power
up to a drive current of 5 mA. A PD which is optically coupled to such an LED can have the
same size. The switching energy, which scales with the PD area, would then be reduced
below 100 fJ, implying an expected switching speed of 80 MHz for perfect coupling. In a
realistic application, however, the coupling is degraded by the broad emission cone of the
LED and the limited aperture of the coupling optics. A Lambertian emission profile and an
NA of 0.2 imply a coupling efficiency of 1% and hence a maximum switching speed of 0.8
MHz. Higher speeds require optics with a higher NA or a reduction in the divergence of the
light source, e.g. by the integration of a VCSEL.
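
The scaling argument above can be followed step by step in a short calculation. This is a sketch only: it takes the 4 MHz figure and the 1% coupling efficiency from the text as given, assumes that the switching energy scales with the area of a single square photodiode (50 µm side reduced to 12 µm), and assumes that the achievable speed scales inversely with switching energy and linearly with coupling efficiency. The function names are illustrative.

```python
def switching_energy_pj(input_power_nw, delay_us):
    """Optical switching energy in pJ (nW x µs = fJ, hence the factor 1e-3)."""
    return input_power_nw * delay_us * 1e-3


def scaled_energy_pj(energy_pj, old_side_um, new_side_um):
    """Scale the switching energy with the area of a square photodiode."""
    return energy_pj * (new_side_um / old_side_um) ** 2


def scaled_speed_mhz(base_speed_mhz, base_energy_pj, new_energy_pj, coupling=1.0):
    """Scale a baseline speed inversely with energy and linearly with coupling."""
    return base_speed_mhz * (base_energy_pj / new_energy_pj) * coupling


measured = switching_energy_pj(200, 8.4)        # 200 nW input, 8.4 µs delay
small_pd = scaled_energy_pj(measured, 50, 12)   # 50 µm -> 12 µm photodiode side
ideal = scaled_speed_mhz(4, 2.0, 0.1)           # 2 pJ -> 100 fJ, perfect coupling
realistic = scaled_speed_mhz(4, 2.0, 0.1, coupling=0.01)
```

With these assumptions the measured energy is about 1.7 pJ, the scaled energy falls just below 100 fJ, and the speeds come out at 80 MHz for perfect coupling and 0.8 MHz at 1% coupling, matching the figures in the text.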

In  conclusion,  we  have  developed  a  layer  structure  grown  in  a  single  step  and  a 
fabrication process for the monolithic integration of optoelectronic smart pixels. Threshold
circuits with a low switching power and switching energy have been demonstrated. The
performance  of  such  circuits  is  suitable  for  application  in  the  fields  of  optoelectronic  neural 
networks  and  parallel  image  processing.  The  switching  speed  expected  in  a  system 
application  is  limited  mainly  by  the  efficiency  and  divergence  of  the  integrated  light  source. 

We respectfully acknowledge the design assistance of K. Engelhardt and the
[...] with J.E. Epler, K.H. Gulden and M. Moser. We are also grateful for the support of
Professors W. Bächtold, W. Kündig and J. Mlynek.

Abstract.

This paper describes our recent liquid crystal on silicon spatial light modulators.
We present the most recent results from our 256 by 256 binary reflection
mode SLM operating in an optical correlator and a 128 by 128 analog SLM
that uses the Distorted Helix Ferroelectric effect. The provisional performance
specifications of the analog SLM are a frame rate of ~2.5 kHz, a zero
order contrast of 30:1, and > 16 grey levels.


1.  Introduction 

Liquid  crystal  on  silicon  spatial  light  modulators  are  made  by  placing  a  thin  layer  of 
liquid  crystal  directly  on  top  of  a  silicon  chip.  For  a  recent  review  of  this  technology 
see,  for  example,  reference  [1].  We  have  constructed  a  256  by  256  binary  SLM  for  an 
optical  correlator  application  which  uses  a  chiral  smectic  C  (SmC*)  ferroelectric  liquid 
crystal as the light modulating layer [2]. We have also constructed a 128 by 128 analog
SLM  which  uses  the  Distorted  Helix  Ferroelectric  (DHF)  [3]  effect  and  in  this  paper  we 
present  some  early  results  from  this  device. 


2.  The  256  by  256  binary  SLM 

The  SLM  is  based  on  a  7mm  by  7mm  CMOS  die  fabricated  by  US2  under  a  1.2  micron 
design  rule.  The  active  array  area  is  a  square  approximately  5.53mm  on  a  side.  Each 
pixel  in  the  array  is  addressed  by  a  row  select  wire  and  a  column  data  wire.  The  row 
wire  activates  the  pixel  transistor  gate  and  the  column  wire  presents  a  binary  data  signal 
to  the  pixel.  On  activation  of  the  gate  wire  the  data  on  the  column  wire  is  written  to 
the  pixel,  where  it  is  capacitively  stored.  The  resulting  electric  field  between  the  pixel 
mirror  and  a  transparent  electrode  on  a  piece  of  cover  glass  drives  the  liquid  crystal  into 
the desired state.

Parameter                        Value
Array size                       256 by 256
Pixel pitch                      21.6 µm
Fill-factor                      79%
Flat fill-factor                 60%
Diffraction efficiency           15%
Throughput into zero order       3.4%
Demonstrated frame load time     43 µs
Simulated frame load time        27 µs
Contrast ratio                   70:1 zero order (10:1 imaged)
LC switching time                75 µs

Table 1. Specifications of and results from the 256 by 256 binary SLM.


Data are transferred to the SLM over 32 parallel lines under the control of a master
clock and a frame sync signal. We have demonstrated the addressing of the SLM with a
master clock frequency of 48 MHz, which gives an image refresh rate of 23.5 kHz. This is
a data rate of 1.6 Gb/s from the driver board to the SLM. To achieve these data rates the
SLM backplane utilised on-chip clock and control signal generation and data pipelining.
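
The relation between clock frequency, bus width and frame rate can be sketched as follows. The calculation assumes that each of the 32 lines carries one bit per clock cycle and that there is no per-frame overhead; these are assumptions, not details given in the text, and the function names are illustrative.

```python
def frame_load_time_us(rows, cols, parallel_lines, clock_mhz, bits_per_pixel=1):
    """Time to load one frame, assuming one bit per line per clock cycle."""
    clock_cycles = rows * cols * bits_per_pixel / parallel_lines
    return clock_cycles / clock_mhz          # cycles / (cycles per µs) = µs


def raw_data_rate_gbps(parallel_lines, clock_mhz):
    """Aggregate bit rate over all parallel lines, in Gb/s."""
    return parallel_lines * clock_mhz / 1000.0


load_us = frame_load_time_us(256, 256, 32, 48)   # about 42.7 µs
refresh_khz = 1000.0 / load_us                   # about 23.4 kHz
rate_gbps = raw_data_rate_gbps(32, 48)           # about 1.54 Gb/s
```

Under these assumptions the frame load time is close to the demonstrated 43 µs in Table 1, and the refresh rate and data rate agree with the quoted 23.5 kHz and 1.6 Gb/s to within rounding.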

2.1.  Results. 

A  cover-glass  was  fixed  over  the  pixel  array,  spaced  with  polyimide  balls,  and  the  gap 
filled with BDH SCE13 liquid crystal by capillary action under vacuum. The liquid
crystal  was  aligned  by  means  of  a  rubbed  PVA  layer  on  the  cover  glass.  The  device 
specifications  are  summarized  in  table  1. 

Two  of  these  SLMs  have  been  used  in  an  optical  correlator  as  input  and  Fourier 
plane  devices  operating  in  a  binary  phase  mode.  The  system  was  operated  at  1000 
correlations per second, i.e. a positive input and filter pattern were written for 500 µs,
followed by their inverses for 500 µs. The camera shutter was set at 1 ms to capture
the output from both true and inverse frames. On and off-axis inputs consisting of
small targets that used 0.6 [...] were used. With the laser producing approximately 5 mW the
correlation peaks in the output were able to saturate the camera and yield a signal to
noise (pk/rms) of greater than 10 dB for most of the images.


3.  The  128  by  128  analog  SLM 

The  silicon  backplane  for  this  device  was  fabricated  on  a  2  micron  CMOS  process 
through  the  MOSIS  brokerage.  Sixteen  analog  input  lines  are  sampled  and  their  voltage 
values  routed  to  the  pixels  under  the  control  of  an  on-chip  controller.  The  analog  data 
are generated by an external driver board which stores the data as 8-bit numbers. Fast
current-output DACs convert these data into signals which are sent to a header board
on which the SLM is mounted. The analog currents are converted to voltage signals by
op-amps on the header board. The factor which most limits the frame-load time of the
SLM is the settling time of the op-amps. By allowing 100 ns for their outputs to settle
we have a frame-load time of 100 µs.

Figure 1. The output from a group of pixels on the analog SLM switching from
fully-off to fully-on. The rise and fall times are ~235 µs.

Figure 2. The output from a group of pixels on the analog SLM switching through
a succession of 16 gray scales. The lower trace is the output intensity and the upper
trace is proportional to the analog input voltage that was applied. The response
is not quite linear so a correction was estimated and applied to the input signal to
generate the approximately linear output scale shown here.
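
The frame-load time of the analog SLM stated in Section 3 follows from the number of pixels served by each analog input line and the settling time allowed per sample. The sketch below assumes that the sixteen lines are sampled in parallel and that one pixel per line is written in each settling interval; the function name is illustrative.

```python
def analog_frame_load_time_us(rows, cols, input_lines, settle_ns):
    """Frame-load time when each line writes one pixel per settling interval."""
    samples_per_line = rows * cols / input_lines
    return samples_per_line * settle_ns / 1000.0   # ns -> µs


load_us_analog = analog_frame_load_time_us(128, 128, 16, 100)
```

For a 128 by 128 array, sixteen lines and 100 ns per sample this gives 102.4 µs, consistent with the quoted frame-load time of about 100 µs.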

The  SLM  coverglass  was  treated  with  a  rubbed  PVA  alignment  layer  and  the  cell 
was filled with the DHF material Hoffmann-La Roche 9807. This alignment scheme is not
ideal  for  this  DHF  material  and  so  we  find  that  there  is  a  lot  of  scattering  from  the  liquid 
crystal  in  the  “off”  state.  We  have  performed  experiments  on  test  cells  using  different 
alignment  materials  and  cooling  schemes  and  we  are  confident  that  these  techniques  will 
yield  improvements  in  the  alignment  quality  in  future  SLMs. 

3.1.  Results  from  the  analog  128  by  128  SLM. 

The scattering from the alignment defects is apparent in the measurement of the contrast
ratio of this SLM. The device was illuminated by a He-Ne laser (633 nm) and the contrast
from a group of pixels comprising about half the SLM array was measured. If only the
zeroth diffracted order is collected then the contrast ratio is 30:1; however, the scattering
from the imperfect alignment leads to a rapid fall-off in contrast as the acceptance angle
of the output optics is increased. If the optics collects the zeroth order and the ±
first orders the contrast drops to 14:1 and continues to drop to [...] at the ±5th and
6th orders. Switching speed results from the analog SLM are shown in figure 1. This
measurement was made by imaging a region of the SLM array onto a photomultiplier and
writing alternately 10 fully-on frames and 10 fully-off frames to the SLM. The results of
a similar experiment are shown in figure 2. In this experiment a set of 16 frames is used
to address the SLM and generate 16 gray levels. Figure 3 shows the result of operating
the SLM in phase mode. A binary phase stripe pattern yields the image on the left and
the analog sinusoidal pattern of the same spatial frequency yields the image on the right.
Using the SLM in this way is equivalent to having one bit of pure phase information and
the rest of the modulation in amplitude.

Figure 3. Contour plots of the optical Fourier transform of the analog SLM when
it was displaying a binary stripe pattern (left), and a sinusoidal stripe pattern
(right). The SLM was operated in phase mode.


High-speed liquid crystal on silicon spatial light modulators

C.  C.  Mao,  D.  McKnight,  and  K.  M.  Johnson 
The  Center  for  Optoelectronic  Computing  Systems 
University  of  Colorado  at  Boulder 
Boulder,  Colorado  80309-0525 

We describe results from a 64 x 64 high-voltage CMOS array that delivers 30 V to each pixel,
giving a liquid crystal switching time of 30 µs.

I.  Introduction. 

Liquid-crystal-on-silicon  (LCOS)  spatial  light  modulators  (SLMs)  have  applications  to 
displays,  optical  beamsteering,  and  optical  information  processing.  In  these  devices,  liquid 
crystals  (LCs)  are  sandwiched  between  a  silicon  VLSI  backplane  and  a  piece  of  glass  coated 
with  a  transparent  conductive  electrode.  Voltages  are  applied  to  metal  mirrors  on  the  silicon 
backplane  to  electrically  switch  a  ferroelectric  liquid  crystal  (FLC).  This  results  in  orienting 
the  FLC  optic-axis  to  two  states.  A  polarized  optical  beam  incident  upon  the  device  from 
the  LC  side  is  reflected  from  the  metal  mirrors,  resulting  in  a  spatial  modulation  of  the  input 
light’s  polarization.  A  polarizer  placed  at  the  output  of  the  SLM  can  convert  the  polarization 
modulation  to  intensity  or  phase  modulation. 

The FLC switching time is inversely proportional to the applied electric field E. With
a standard 2 µm CMOS process, the VLSI circuitry provides voltages between 0 V and 5
V at each metal mirror for switching the liquid crystal. In general, such low voltages result
in FLC switching times ranging between 0.1 ms and 0.5 ms. Recently, McKnight et al.
fabricated an LCOS spatial light modulator with a 1 µm FLC layer, resulting in a 2.5 V/µm
applied E field and a 10% - 90% rise time of 75 µs. If higher voltages are applied to the
metal mirrors, faster FLC response times should be possible. In this paper, we describe a
high-voltage (HV), 64 x 64 LCOS SLM that provides 30 V to each metal mirror, resulting in
an LC switching time of 30 µs.
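
The inverse scaling between switching time and field can be illustrated with a short calculation. The sketch below assumes the simple relation (switching time proportional to 1/E) and uses the 75 µs rise time at 2.5 V/µm quoted above as its reference point. The function names, the 1 µm layer in the high-voltage case and the assumption that material and geometry are unchanged are illustrative only, so the result indicates a trend rather than a prediction for any particular device.

```python
def applied_field(v_supply, thickness_um):
    """Field magnitude (V/um) across the FLC when the cover-glass electrode
    sits at half the supply, so each mirror sees +/- v_supply / 2."""
    return (v_supply / 2.0) / thickness_um


def scaled_switching_time(t_ref_us, e_ref, e_new):
    """Scale a reference switching time assuming time is proportional to 1/E."""
    return t_ref_us * e_ref / e_new


# Reference point from the text: 5 V supply, 1 um layer -> 2.5 V/um, 75 us.
e_ref = applied_field(5.0, 1.0)
# Hypothetical comparison: 30 V supply across the same 1 um layer.
e_hv = applied_field(30.0, 1.0)
t_hv = scaled_switching_time(75.0, e_ref, e_hv)
print(f"E = {e_hv:.1f} V/um, estimated switching time = {t_hv:.1f} us")
```

A measured switching time also depends on the FLC material, its viscosity and the actual layer thickness, so it need not agree with this idealised estimate.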

II.  High-voltage  CMOS  array  design 

The maximum voltage a transistor can switch is limited by the junction breakdown
voltage. Transistor junction breakdown voltage is inversely proportional to the impurity
concentration in the diffusion regions. This implies that using a lower doping concentration
for the transistor drains should extend their voltage handling capability. This approach is
applied to the structure shown in Fig. 1. The n-fet uses the n-well (10^^/cm^) as its drain,
which has a lower doping than n-diffusion (10^°/cm^). The p-fet uses p-base (10^^/cm^) as
its drain, which has a lower doping than p-diffusion (10^^/cm^). Neither of these layers is
masked by polysilicon during implantation, and hence neither is self-aligned. Care must be
taken to ensure overlap with a thin oxide gate region to provide channel continuity. In our
design, this overlap is 3 µm. The contact to these drains is surrounded by the lightly doped
drain material to isolate it from the surrounding substrate or well.

Our previous test results showed that the HV n-fet can accommodate voltages up to 30
V, while the HV p-fet can switch voltages up to 15 V. In general, a field of ±10 V/µm
is required for switching FLCs with the highest speed and the largest optic-axis switching
angle. Therefore, to operate the LCOS device with an LC layer thickness of 1 µm or larger
under these optimum conditions, a voltage higher than 10 V is required. To obtain bipolar
voltages with peak amplitudes higher than 10 V, the circuit must operate with voltages higher
than 20 V. This voltage exceeds the p-fet breakdown voltage. There are two approaches to
overcome this limitation: 1) use a more lightly doped p-fet drain region, 2) drop the voltage
across several p-fets in series. We chose the latter approach, because we cannot specify the
doping concentration of integrated circuits processed through the University of Southern
California MOS Implementation Service (MOSIS).
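
The voltage budget in this paragraph can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the supply divides equally between identical devices connected in series.

```python
import math


def required_supply(field_v_per_um, thickness_um):
    """Supply needed for a bipolar drive of +/- field * thickness."""
    peak = field_v_per_um * thickness_um   # peak voltage across the FLC layer
    return 2.0 * peak                      # a bipolar swing needs twice the peak


def series_devices_needed(supply_v, breakdown_v):
    """Smallest number of series devices so each one sees <= breakdown_v,
    assuming the supply divides equally between them."""
    return math.ceil(supply_v / breakdown_v)


supply = required_supply(10.0, 1.0)              # 10 V/um across a 1 um layer
print(f"minimum supply: {supply:.0f} V")
print("p-fets in series at 30 V:", series_devices_needed(30.0, 15.0))
```

With the 15 V p-fet limit quoted above, two devices in series are enough for a 30 V supply under the equal-division assumption.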

The circuit which can accommodate voltages up to 30 V is shown in Fig. 2. The HV
addressing circuit includes four HV p-fets and two HV n-fets. The bias voltage, V_b, is
approximately half of the HV, V_hi. The select signals, V_1 and V_2, are 0 V and 5 V signals
generated by a standard CMOS control circuit. When V_1 = 5 V and V_2 = 0 V, the output
voltage, V_out, is charged to V_hi. The voltage at the node between p-fets 4 and 5 is equal to V_hi,
and the voltage on the gate of p-fet 1 is equal to V_hi, thus turning off p-fet 1. The node between
p-fet 2 and n-fet 3 is discharged to 0 V. In this case, the voltage at the node between p-fets
1 and 2 is approximately V_hi/2. The voltage across p-fets 3 and 4 is 0 V. If V_1 = 0 V and V_2
= 5 V, the resulting voltages are reversed. This ensures that the drain-to-source
voltage across any p-fet does not exceed V_hi/2, so the circuit can be operated with voltages
up to 30 V. Another advantage of this approach is that the control circuits can be fabricated
with conventional CMOS transistors.

We have designed and fabricated a high-voltage VLSI array with 64 x 64 pixels using this
circuit to convert low voltages to high voltages (see Fig. 3). The array was fabricated in the
2-µm n-well process through the MOSIS foundry. Pixels are located on 48 µm centers and
each pixel contains one HV n-fet and a metal mirror. The n-fet consists of n-well (drain),
n-diffusion (source), and polysilicon (gate). The modulating mirror is a metal2 layer, which
does not overlap the transistor. The pixel flat fill factor is 60%. The array is addressed a line
at a time using row shift registers located on the left and right sides of the array, comprised
of the conventional CMOS transistors and the HV circuits for converting low voltages to
high voltages. Data are input onto the columns using 16 parallel data lines and multiplexers
positioned at the top and bottom of the chip.
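
The array figures quoted above imply some simple bookkeeping, sketched below. The sketch assumes that every multiplexed load writes 16 columns at once and, purely for illustration, that the flat mirror is square; neither assumption is stated in the text, and the variable names are hypothetical.

```python
import math

ROWS, COLS = 64, 64
DATA_LINES = 16          # parallel data lines feeding the column multiplexers
PITCH_UM = 48.0          # pixel pitch
FLAT_FILL = 0.60         # flat fill factor quoted in the text

loads_per_row = COLS // DATA_LINES          # multiplexed loads needed to fill one row
loads_per_frame = loads_per_row * ROWS      # line-at-a-time addressing of all rows
mirror_area = FLAT_FILL * PITCH_UM ** 2     # flat mirror area per pixel (um^2)
mirror_side = math.sqrt(mirror_area)        # side length if the mirror were square

print(f"{loads_per_row} loads per row, {loads_per_frame} loads per frame")
print(f"flat mirror area {mirror_area:.1f} um^2 (about {mirror_side:.1f} um square)")
```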

III.  Experimental  results 

An HV LCOS SLM was assembled and tested. The liquid crystal used is the British Drug
House smectic C* FLC SCE-13. The functionality of the device was tested by writing binary
images onto the array and observing the output optical images under a polarizing microscope.
A voltage equal to V_hi/2 is applied to the common electrode on the cover glass. A voltage of
0 V on the metal mirror switches the local optic axis of the FLC to what we
define as the OFF state. When a metal mirror is charged to voltage V_hi, the local optic axis
of the FLC is switched to what we define as the ON state. When a polarized, uniform light
beam illuminates the device, the light reflected from metal mirrors at V_hi ideally undergoes
a 90° polarization rotation and is transmitted by an output analyzer. The light reflected from
metal mirrors at 0 V is not polarization rotated, and hence is extinguished by the output
analyzer. Fig. 4 shows an example of output optical images from the SLM.
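
The ON/OFF behaviour described above can be reproduced with a small Jones-calculus sketch. It rests on assumptions that are not stated in the text: the reflective FLC cell is modelled as an ideal half-wave retarder over the round trip, its optic axis is taken to switch by 45° between the two states, and the input polarization is aligned with the OFF-state axis. The function names are hypothetical.

```python
import cmath
import math


def waveplate(retardance, axis_angle):
    """Jones matrix of a linear retarder with its optic axis at axis_angle (rad)."""
    c, s = math.cos(axis_angle), math.sin(axis_angle)
    e = cmath.exp(1j * retardance)
    # R(a) . diag(1, e) . R(-a), written out element by element
    return ((c * c + e * s * s, (1 - e) * c * s),
            ((1 - e) * c * s, s * s + e * c * c))


def crossed_analyzer_intensity(retardance, axis_angle):
    """Fraction of x-polarized input passed by a y-oriented analyzer."""
    jones = waveplate(retardance, axis_angle)
    e_y = jones[1][0]            # y-component produced from unit x-polarized input
    return abs(e_y) ** 2


HALF_WAVE = math.pi
off = crossed_analyzer_intensity(HALF_WAVE, 0.0)                  # axis along input
on = crossed_analyzer_intensity(HALF_WAVE, math.radians(45.0))    # axis switched 45 deg
print(f"OFF transmission = {off:.3f}, ON transmission = {on:.3f}")
```

In this idealised model the OFF state is fully extinguished; in a real device the contrast is finite because of alignment defects, thickness errors and imperfect polarizers.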

Figure 2. High-voltage addressing circuit

Figure 3. Photograph of the fabricated high-voltage chip

Figure 4. Photograph showing an output optical pattern
from the high-voltage LCOS spatial light modulator.

Figure 5. Oscilloscope trace showing an FLC switching
time of 30 µs.

We also tested the response times and ON/OFF contrast ratio of the SLM. Fig. 5 gives
an oscilloscope trace showing the switching times of the FLC with an applied voltage, V_hi =
25 V. The 10% - 90% FLC switching time is approximately 30 µs. The measured ON/OFF
contrast ratio (I_on/I_off) is 13:1.
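
A 10% - 90% switching time of the kind quoted above is normally extracted from a sampled optical trace. The sketch below shows one way to do this by linear interpolation between samples. The trace is synthetic (a linear ramp), not measured data, and the function names are hypothetical.

```python
def crossing_time(times, values, level):
    """First time the sampled trace rises through `level` (linear interpolation)."""
    for i in range(1, len(values)):
        v0, v1 = values[i - 1], values[i]
        if v0 < level <= v1:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("trace never crosses the requested level")


def rise_time_10_90(times, values):
    """Time taken to go from 10% to 90% of the full swing of the trace."""
    low, high = min(values), max(values)
    span = high - low
    t10 = crossing_time(times, values, low + 0.1 * span)
    t90 = crossing_time(times, values, low + 0.9 * span)
    return t90 - t10


# Synthetic trace: a linear ramp from 0 to 1 over 100 us, sampled every 1 us.
times_us = [float(t) for t in range(0, 101)]
trace = [t / 100.0 for t in times_us]
rise = rise_time_10_90(times_us, trace)
print(f"10%-90% rise time: {rise:.1f} us")
```

For a noisy oscilloscope trace the signal would usually be averaged or smoothed before the threshold crossings are located.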

IV.  Conclusions 

We have designed, fabricated, and tested a high-voltage, 64 x 64 array liquid crystal on
silicon spatial light modulator. The SLM can switch 30 V at the pixel mirrors, and has a
10% - 90% liquid crystal switching time of 30 µs. Its contrast ratio is 13:1.
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Abstract.  Recent  advances  in  electronically  addressed  ferroelectric  liquid  crystal 
over  silicon  spatial  light  modulators  include  the  design  of  devices  of  increased  speed, 
larger pixel count and greater optical efficiency. We present recent results illustrating
each of these features.

1.  Background 

The Spatial Light Modulator (SLM) is a key component for many optical computing
systems. Historically, weak SLM performance has been a limiting factor for
some systems. The SLM technology of Ferroelectric Liquid Crystal over Very Large
Scale Integrated (FLC/VLSI) silicon has matured considerably in recent years.
Electronically Addressed SLMs (EASLMs) of medium resolution (128^2 pixels or more)
have been reported by several labs [1]; a working system containing multiple devices
has been reported [2].

2.  Introduction 

FLC/VLSI SLMs are produced by sandwiching a thin layer (~1 µm) of FLC between
a custom designed silicon backplane and a cover glass coated on the inside with a
transparent conductive electrode [1]. Most EASLM backplane designs are based on
a pixel circuit consisting of one active element - a Metal-Oxide-Semiconductor Field
Effect Transistor (MOSFET) which acts as a switch to control the amount of charge
stored on a capacitive storage element - a small metal mirror on the surface of the
silicon. The voltage thus generated produces an electric field which alters the state
of the trapped layer of FLC to produce a binary phase or amplitude modulation in
an incident wavefront. This scheme provides the smallest possible pixel and thus the
highest density of pixels; it is analogous to the purely electronic Dynamic Random
Access Memory (DRAM) cell.

This type of pixel suffers from light-induced charge leakage, which is manifest as a
reduction in contrast ratio with increasing intensity of incident light, thus limiting
the maximum light level at which it can be operated. There is also a limit to the
spontaneous polarization, P_s, of the FLC material which can be used - the pixel
capacitor must store enough charge to switch it. The drive to reduce the pixel size
is compromised by the need to maintain an optically flat metal area to act as the
reflective aperture (or mirror). Underlying circuit elements such as transistors or
interconnect cause undulations in the overlying part of the mirror; these can cause
nonuniform optical contrast across a mirror, losses due to scattering and, in coherent
systems, phase variations. A flat fill factor [5], F_f (flat mirror area / pixel area), of
around 25% has become a de facto minimum. More quantitatively, if we consider the
pixel area, A_pix, to be made up of two components - the flat mirror area, A_mirror,
and the rest of the pixel (bus lines, transistors etc.), A_circ - we can write

F_f = \frac{A_{mirror}}{A_{pix}} = \frac{A_{mirror}}{A_{mirror} + A_{circ}}    (1)

Within a given CMOS process (which fixes the minimum size of the circuit elements
and the maximum die size) and for a given pixel functionality (which fixes the number
of circuit elements required), there is a direct trade-off between flat fill factor and pixel
area (or maximum number of pixels).
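
The trade-off can be made concrete with a small numerical sketch. It assumes that the circuit area per pixel stays fixed while the pitch is varied and that the mirror occupies all remaining area. The circuit area is taken from the 176 DRAM entry of Table 1 (30 µm pitch, 15 x 11 µm mirror), while the 10 mm die side and the function names are hypothetical placeholders.

```python
def flat_fill_factor(pitch_um, circuit_area_um2):
    """Flat mirror area / pixel area, with pixel area = pitch^2 and the mirror
    taking whatever area the circuitry leaves free."""
    pixel_area = pitch_um ** 2
    mirror_area = max(pixel_area - circuit_area_um2, 0.0)
    return mirror_area / pixel_area


def pixels_per_side(die_side_um, pitch_um):
    """Number of whole pixels that fit along one side of the die."""
    return int(die_side_um // pitch_um)


CIRCUIT_AREA = 30.0 ** 2 - 15.0 * 11.0   # 735 um^2, from the 176 DRAM entry of Table 1
DIE_SIDE = 10_000.0                       # hypothetical 10 mm usable die side

for pitch in (30.0, 40.0, 60.0):
    fill = flat_fill_factor(pitch, CIRCUIT_AREA)
    n = pixels_per_side(DIE_SIDE, pitch)
    print(f"pitch {pitch:.0f} um: fill factor {100 * fill:.0f} %, {n} x {n} pixels")
```

Increasing the pitch raises the fill factor but reduces the number of pixels that fit on the die, which is the trade-off stated above.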


3.  512  X  512  DRAM  SLM 


We have previously reported a 176 x 176 DRAM-style pixel array [3]. The frame
rate was limited to around 1 kHz (dc balanced) by the RC time constant of the
relatively high resistance polysilicon row access lines. A 512 x 512 pixel array (based
upon the original 176 x 176 device) has been designed. The primary modification
has been the use of aluminium to replace the polysilicon row access lines within
the pixel array. The finished design has been fabricated by Austria Mikro Systeme
and is undergoing electrical testing. A design flaw has been detected in the on-chip
address circuits; the drive sequence has been altered in software in order to provide
a temporary workaround to this problem.
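
The benefit of replacing polysilicon with aluminium can be illustrated with a rough RC estimate. Every number in the sketch below is an illustrative placeholder rather than a value from the devices: the sheet resistances, track width and line capacitance are assumed. The 0.38 RC factor is the usual approximation for the 50% step-response delay of a distributed RC line.

```python
def line_resistance(sheet_ohm_per_sq, length_um, width_um):
    """Resistance of a track: sheet resistance times the number of squares."""
    return sheet_ohm_per_sq * (length_um / width_um)


def distributed_rc_delay(resistance_ohm, capacitance_f):
    """Approximate 50 % step-response delay of a distributed RC line (0.38 RC)."""
    return 0.38 * resistance_ohm * capacitance_f


# All numbers below are illustrative placeholders, not values from the devices.
LENGTH_UM = 512 * 30.0       # one row line spanning 512 pixels on a 30 um pitch
WIDTH_UM = 3.0               # assumed track width
LINE_CAP_F = 5e-12           # assumed total capacitance of the row line
SHEET_OHM = {"polysilicon": 25.0, "aluminium": 0.05}   # assumed sheet resistances

delays = {}
for material, sheet in SHEET_OHM.items():
    r = line_resistance(sheet, LENGTH_UM, WIDTH_UM)
    delays[material] = distributed_rc_delay(r, LINE_CAP_F)
    print(f"{material}: R = {r:.0f} ohm, delay = {delays[material] * 1e9:.2f} ns")
```

Because the capacitance is unchanged, the delay scales directly with the sheet resistance of the row-line material under these assumptions.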

4.  256  X  256  SRAM  SLM 


An alternative to the single transistor pixel design above is based around an
enhancement of the six transistor Static RAM (SRAM) cell. The enhancement involves
inserting a simple logic gate between the memory node and the pixel mirror/electrode
node so as to allow the addressing requirements of the FLC to be met more easily.
The SRAM design overcomes all of the above disadvantages of the DRAM pixel at
the expense of increased transistor count leading to increased pixel area. In
particular it maintains its state indefinitely, allows the use of the fast-switching high P_s
FLC materials and shows no variation of contrast ratio with incident light intensity
over a wide range of intensity. We have demonstrated a fully working 256 x 256 pixel
array built in 1.2 µm CMOS technology. The contrast ratio, measured directly by
imaging pixels, is around 9-10:1. The FLC switching time has been measured at
150 µs (rise) and 60 µs (fall) with a supply voltage to the backplane of V_dd = 6 V, see
Figure 1. This asymmetry indicates a likelihood that the FLC in these particular
devices is not bistable but tending towards a monostable configuration. Further
investigations are underway.

5.  Increasing  Pixel  Flat  Fill  Factor 

We have successfully applied a post-processing wafer planarisation technique to the
176 x 176 DRAM and 256 x 256 SRAM devices. This has allowed an optically flat
mirror to be placed on top of the existing circuitry. The technique involves the
deposition of a thick dielectric layer which is subsequently polished flat; this
facilitates the deposition of a further metal layer which is patterned to form a flat mirror
covering almost the entire pixel. The planarisation technique is described in detail
elsewhere [4]. FLC cell construction has been carried out successfully on planarised
backplanes. The increased flat fill factor of the backplane increases the light
throughput of the finished SLM and reduces the stray light reaching the substrate.
For an SLM designed with planarisation in mind, it is possible to achieve

(2)

coupled with a high flat fill factor

F_f = \left( \frac{p - m}{p} \right)^2    (3)

where p is the pixel pitch and m is the minimum allowed gap between adjacent
mirrors. Figure 2 shows the effect of planarisation on the same image on a small
part of a 256 x 256 SRAM SLM. The via contact leaves a small dark patch even on
the planarised pixels.

Figure 1: FLC optical response

Figure 2: Part of SLM displaying pattern, unplanarised (left) and planarised (right)
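
The fill-factor expression for planarised pixels, read here as ((p - m) / p)^2 for a square mirror, can be checked against the pitches and planarised mirror sizes listed in Table 1. The sketch below is illustrative only: the function name is hypothetical, and the gap m is inferred as pitch minus mirror side.

```python
def planarised_fill_factor(pitch_um, gap_um):
    """Flat fill factor when a square mirror covers the pixel up to a gap m."""
    return ((pitch_um - gap_um) / pitch_um) ** 2


# Pitch and planarised mirror side (um) as listed in Table 1.
devices = {
    "176 DRAM": (30.0, 27.0),
    "16 SRAM": (200.0, 196.0),
    "256 SRAM": (40.0, 37.0),
}

for name, (pitch, mirror_side) in devices.items():
    gap = pitch - mirror_side
    fill = planarised_fill_factor(pitch, gap)
    print(f"{name}: gap {gap:.0f} um, fill factor {100 * fill:.1f} %")
```

The computed values (about 81%, 96% and 86%) agree with the planarised fill factors tabulated for these devices to within rounding.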

6.  Summary 

Table 1 summarises the present situation. High resolution devices, based on both
DRAM and SRAM pixels, have been built. The planarisation process, which
significantly enhances device performance, has been demonstrated on two SLM backplane
designs. It is, in principle, equally applicable to all of the devices of Table 1 and
to any future devices, including smart pixel SLMs. Research has commenced on a
method of via filling to remove the effect present in Figure 2.

Device                             176 DRAM     512 DRAM     16 SRAM      50 SRAM      256 SRAM
Date                               1989         1994         1986         1988         1994
Fab process                        3 µm CMOS    3 µm CMOS    6 µm nMOS    1.5 µm nMOS  1.2 µm CMOS
Pixel pitch (µm)                   30           30           200          72           40
Mirror size, nominal (µm²)         15 x 11      15 x 11      110 x 110    40 x 40      19 x 19
Mirror size, after planar'n (µm²)  27 x 27      27 x 27†     196 x 196    NA           37 x 37
F_f (%), before / after planar'n   18 / 81      18 / 81      30 / 96      31 / NA      23 / 85
Elec address time (µs)             250          1000†        1            N/A          85
Frame rate (Hz)                    1000         250†         5            20           4000

Table 1: Silicon backplanes for FLC/VLSI EASLMs († design estimate)

Given  the  current  (minimum  feature  size  and  maximum  die  size)  limitations  of 
using  commercial  silicon  vendors  it  is  possible  to  pursue  the  DRAM  approach  to 
1024  X  1024  and  the  SRAM  to  512  x  512  with  frame  rates  comparable  to,  or  better 
than,  those  of  Table  1.  Access  to  more  specialised  current  fabrication  processes,  or 
the  ongoing  process  development  of  the  commercial  silicon  vendors,  will  allow  the 
design  and  fabrication  of  even  higher  performance  SLM’s.  Even  then,  the  highest 
performance  is  only  likely  to  be  achieved  through  custom  post-processing  to  produce 
an optically flat and FLC-friendly surface layer.
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ABSTRACT 

We present the sensitivity transition characteristics of the LAPS-SLM. Furthermore, an optical neural
network that uses the LAPS-SLM as a thresholding device is presented. The sensitivity transition
coefficient is controllable by the driving waveform or the light intensity.

In the concrete system, the LAPS-SLM is used as the weighting and thresholding device. A new
learning scheme based on the sensitivity transition of the LAPS-SLM is presented and discussed.


1.  Introduction 


Interest in optical processors, optical correlators, and optical neural network systems has
grown due to their inherent high speed and parallelism. Spatial light modulators (SLMs) are
basic components in these systems. We have been developing a spatial light modulator
using a hydrogenated amorphous silicon photoconductor and a surface stabilized
ferroelectric liquid crystal light modulator (LAPS-SLM).(2),(3)

Owing to the threshold in the driving pulse voltage, FLC-SLMs have a definite write light
intensity threshold. As has been discussed for matrix displays, an accumulation
effect is inherent in devices using the SSFLC mode. In the SLM, the accumulation effect
appears as a sensitivity transition. Here we consider the sensitivity
transition characteristics of the LAPS-SLM and propose applying these
nonlinear characteristics to an optical neural network system.


2.  Device  fabrication  and  experimental  conditions 


The structure of the LAPS-SLM is shown in Fig. 1. The alignment layers are formed by
oblique evaporation of SiO. The input and output windows are clamped together and the
FLC material is introduced by the capillary method. The writing experiments are made on the
microstage of a polarization microscope. The write light is at 550 nm with a half-width of 40 nm,
and the readout light is at 630 nm with a half-width of 80 nm. The polarizer and the analyzer are set
in the crossed Nicols state. In these experiments, the driving waveform is the bipolar pulse
shown in Fig. 2.


Fig. 1 Schematic illustration of LAPS-SLM
(labelled parts: glass, ITO, FLC, alignment layers; write light and read light)

Fig. 2 Standard driving waveform
Erase pulse : 10 V, 2 ms
Writing pulse : -10 V, 1 ms
Readout period : 7 ms

Fig. 3 Sensitivity transition when write light intensity is modified (horizontal axis: log t)
○: 200 µW/cm², □: 400 µW/cm², △: 600 µW/cm²

Fig. 4 Sensitivity transition when erase pulse voltage is modified
○: 12.5 V, □: 10 V, △: 7.5 V

Fig. 5 Sensitivity transition when erase pulse width is modified
○: 0.75 ms, □: 1 ms, △: 1.5 ms


3.  Results  and  discussions 

The write light intensity (I), erase pulse voltage (p) and erase pulse width (q) are varied, and
the transition of the threshold write light intensity (Th_I) is measured. Sensitivity is
defined as Th_I / Th_I0, where Th_I0 is the initial threshold write light intensity. Figs.
3, 4 and 5 show the sensitivity transition of the LAPS-SLM. Under every condition, the
threshold write light intensity decreases monotonically. Fig. 3 shows the dependence of
the sensitivity transition on the write light intensity (I). The driving waveform is the standard
one shown in Fig. 2, and the write light intensity is set to 200, 400 and 600 µW/cm².
Readout light is irradiated continuously, and each measurement is made
individually. Fig. 4 shows the dependence of the sensitivity transition on the erase pulse
voltage (p). Fig. 5 shows the dependence of the sensitivity transition on the erase pulse
width (q). These data show that the sensitivity transition is proportional to
log t, and that the coefficient is controllable by I, p and q. Fig. 6 shows the mitigation
characteristic after 10 minutes of continuous writing. The mitigation is also
proportional to log t. The accumulation coefficient can therefore be written in the form
-α log t (α = φ(I, p, q)), and the mitigation coefficient in the form β log t (β = constant).
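
A logarithmic law of this kind can be extracted from sensitivity-versus-time data with an ordinary least-squares fit against log t. The sketch below is a minimal illustration: the data are synthetic and noise-free rather than measured, the logarithm is assumed to be base 10, and the function and variable names are hypothetical.

```python
import math


def fit_log_law(times, values):
    """Least-squares fit of values = intercept + slope * log10(time)."""
    xs = [math.log10(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic, noise-free data (not measurements): sensitivity = 1 - 0.2 * log10(t).
times_s = [1.0, 10.0, 100.0, 1000.0]
sensitivity = [1.0 - 0.2 * math.log10(t) for t in times_s]

intercept, slope = fit_log_law(times_s, sensitivity)
alpha = -slope   # coefficient of the decreasing log-law term
print(f"intercept = {intercept:.3f}, alpha = {alpha:.3f}")
```

Repeating such a fit for each write intensity, erase voltage and erase width would give the dependence of the coefficient on I, p and q.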

These accumulation and mitigation characteristics are phenomenologically similar to
memory and forgetting in the brain.(4),(5) I would like to consider the application of
these characteristics to the optical neural network. For controlling the accumulation
characteristics, p or q seems adequate. However, the effect of I is slightly
different because, as shown in Fig. 3, the dependence of the accumulation on I has a
minimum. This characteristic may make it possible to realize a negative shift when the
weight matrix is renewed.

Here, I would like to consider the cause of the sensitivity transition using Figs.
7. Fig. 7-1 shows the counter potential remaining in a memory state of the SSFLC. When
optical modulation takes place, the FLC molecules turn around the cone and the
spontaneous polarization is reversed.

7-2 : electron drift in a-Si:H layer
and accumulation to surface

7-3 : influence of write light intensity on
mobility of electrons in a-Si:H layer

Figs. 7 Examination of accumulation effect

As the spontaneous polarization reverses, a polarization reversal current flows
and the ions contained in the FLC drift at the same time. In the memory
state, the counter potential due to the drifted ions remains. In response to the drifted
ions in the FLC layer, photogenerated electrons drift and accumulate at the
surface, as shown in Fig. 7-2. These accumulated electrons act to lower the write
light intensity threshold. The sensitivity transition characteristics are controllable by
the write light intensity and the erase pulse. This will be very important when these
characteristics are applied to actual systems. However, the dependence of the
sensitivity transition on the write light intensity exhibits the extreme value shown in Fig. 3.
This can be explained by the difference in the low resistivity area generated by the write
light, as shown in Fig. 7-3. If sufficiently strong write light is irradiated, the drifted
electrons easily flow out and the accumulation effect is weakened.


4.  Application  for  optical  neural  networks 

Recently, optical neural networks have been actively investigated. Optical systems are
essentially well matched to neural networks because of their inherent parallelism.
For instance, an optical neural network system was presented by Psaltis and Farhat.(6)
However, the performance of the optical neural network systems presented so far is
critically limited by the thresholding and feedback steps. SLMs take charge of
the main part of an optical neural network system, and the LAPS-SLM in particular
offers very good performance for this application: memory and thresholding functions,
higher contrast, higher resolution and faster response time.

From the data shown in Figs. 3 and 4, it seems that the LAPS-SLM would work as a
brain-like memory when used in an optical neural network system. When
LAPS-SLMs are used, the vector-matrix multiplication is carried out optically, and of course the
LAPS-SLM is also useful on the thresholding plane. Moreover, active use of the
sensitivity transition effect of the LAPS-SLM enables a learning operation on the threshold
level. If the LAPS-SLM is used as the vector-matrix multiplier, the weight matrix on the
multiplication plane and the output vectors are binary, but the threshold levels are
continuously variable and controllable. Namely, LAPS-SLMs work as neural
memories that learn on the threshold levels. The nonlinearity is determined by the
accumulation and mitigation times.

With Fig. 8, I would like to propose a neural network system using LAPS-SLMs as
the vector-matrix multiplier and thresholding device. The system operates in the
following sequence. First, the initial weight matrix W(N) is displayed on the LCTV and
is written and latched on LAPS-SLM1. Next, the weight matrix W(N) on LAPS-SLM1
is read out by the light of the input vector V_i, which is generated by the LED array and
expanded by the cylindrical lens CL1. In this way, the optical vector-matrix multiplication is
performed.
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Fig.  8  Schematic  illustration  of  optical  neural 
network  system  applying  the  LAPS-SLM 


The output light is contracted by the cylindrical lens CL2 and projected on LAPS-SLM2.
At this time, thresholding is executed and the result vector f(ΣW(N)V_i) is
memorized on LAPS-SLM2. The thresholding function f(x) on LAPS-SLM2 is
written as f(x) = F(x_i - h), where F is a step function arising from the bistability of the FLC
modulator, and h is the threshold level of LAPS-SLM2, which is controllable by the
accumulating time t, the input light intensity I, the driving pulse voltage p and the pulse width q, as
shown in Figs. 3, 4, 5 and 6. The output is obtained by the PD array, and the weight matrix on
LAPS-SLM1 is renewed according to these results. Learning is carried out by this renewal
of the weight matrix following the delta rule. The renewal of the weight matrix is
written as

W(N+1) = (1 - b) \cdot W(N) + a \cdot \delta W(N),

where W(N) is the weight matrix before renewal, a is the learning coefficient, b is the
forgetting coefficient and δW(N) is the correction given by the delta rule. Here the iteration
number N already includes the effect of the accumulation and mitigation times. The
learning and forgetting coefficients a and b are also provided by the sensitivity transition
characteristic of the LAPS-SLM, and take the forms

β log t (β = constant) and -α log t (α = φ(I, p, q)).

These expressions show a new learning method that introduces the effect of learning and
forgetting time. In this system, the convergence characteristics are affected by the learning and
forgetting times, and would be controllable.
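
The thresholded vector-matrix product and the weight renewal with learning and forgetting coefficients can be sketched numerically as follows. This is an illustration under stated assumptions rather than the proposed optical implementation: the weights are real-valued instead of binary, the delta-rule correction is taken as (target - output) times the input, the coefficients a and b are fixed numbers instead of functions of log t, and all names and values are hypothetical.

```python
def step(x):
    """Step function F: 1 when the argument is positive, else 0."""
    return 1.0 if x > 0.0 else 0.0


def forward(weights, v_in, thresholds):
    """Thresholded vector-matrix product F(sum_i W[j][i] * V[i] - h[j])."""
    return [step(sum(w * v for w, v in zip(row, v_in)) - h)
            for row, h in zip(weights, thresholds)]


def delta_rule(v_in, output, target):
    """Delta-rule correction dW[j][i] = (target[j] - output[j]) * V[i]."""
    return [[(t - o) * v for v in v_in] for o, t in zip(output, target)]


def renew(weights, d_weights, a, b):
    """W(N+1) = (1 - b) * W(N) + a * dW(N)."""
    return [[(1.0 - b) * w + a * dw for w, dw in zip(w_row, dw_row)]
            for w_row, dw_row in zip(weights, d_weights)]


# Hypothetical toy values chosen only to exercise the functions.
W = [[0.0, 0.0], [1.0, 1.0]]
V = [1.0, 1.0]
h = [0.5, 0.5]
target = [1.0, 1.0]

out = forward(W, V, h)                       # first unit stays below its threshold
W_next = renew(W, delta_rule(V, out, target), a=0.5, b=0.1)
print("output before renewal:", out)
print("output after renewal: ", forward(W_next, V, h))
```

In the scheme described above the thresholds h could also be adapted through the sensitivity transition; the sketch keeps them fixed for simplicity.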

As on the vector-matrix plane, learning of the threshold level could be realized on the
thresholding plane where LAPS-SLM2 is applied.

In future, I will examine a more accurate description of the learning and its experimental
verification. Moreover, I would like to consider the negative shift of the weight
matrix produced by the effect of I on the accumulation.

5.  Conclusion 

The  sensitivity  transition  characteristics  of  the  LAPS-SLM  are  applied  to  an  optical  neural 
network  system.  In  this  system,  LAPS-SLMs  work  as  the  neural  memories  that  learn  on 
the  threshold  levels. 
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All-Optical  Dynamic  Memories 


M.P. Petrov 

A. F. Ioffe Physical Technical Institute, Russian Academy of
Sciences, St. Petersburg, 194021, Russia

Abstract. The principles of operation of an all-optical dynamic
memory (AODM) are considered. Considerable progress in the
development of optical regenerators, which are the key elements of
AODM, is emphasized. An analysis of trends in the development
of AODM and numerical estimates show that an AODM with an
information capacity of more than $10^{6}$ bits, practically unlimited
storage time, a bit error rate of less than $10^{-9}$, and acceptable
optical power consumption can be built.


1.  Introduction 

An all-optical dynamic memory (AODM) is designed to store information in
the form of optical pulses. Such a memory inevitably has to be dynamic, but
in the general case it need not be all-optical.
However, when the pulse duration is on the order of picoseconds and the
repetition rate is tens of gigahertz, an all-optical implementation of such
a memory seems to be the only possible one. The very wide bandwidth of
optical systems can be efficiently exploited in this case.

The most obvious applications of AODM are in digital optical
computers, optical telecommunication systems, coding and decoding systems,
and very broad-bandwidth signal processing.

The information capacity (N) of AODM is determined by the number of
pulses placed along the optical path, so for an optical path of
more than 1 meter it is reasonable to use an optical fiber as a stable
medium in which the moving optical pulses are stored. For moderate storage times
($10^{-6}$ to $10^{-2}$ s) the soliton system with partial compensation of losses and
reduced jitter in pulse arrival times can be used efficiently [1,2]. But in
the case of long (perhaps practically unlimited) storage times the
optical pulses have to propagate over long distances or circulate many times
around the fiber loop, covering many millions of kilometers. Then, to ensure
a low bit error rate (BER) of $10^{-9}$ or less, the AODM must be capable of maintaining a
high signal-to-noise ratio and of restoring not only the intensity and
shape but also the exact time position of the signal pulses. All these
characteristics deteriorate during a long pulse journey because of
light losses, group velocity dispersion, nonlinear optical phenomena in
fibers, interaction between pulses and instabilities of various components
of the AODM. The problem of stable AODM operation with a virtually
unlimited storage time is very similar to that of ultralong optical data
transmission systems and raises the same questions for physics and
communication theory concerning the increase of information entropy.


2.  Optical  regenerators 

To restore the signal pulse characteristics and to ensure the stability of the
fiber loop, which is a system with feedback, an optical regenerator must be
developed; it is the key element of AODM. The principles of
operation of optical regenerators are very close to those of all-optical
switching and logic elements [3]. Nonlinear phenomena [4,5] such as
self-phase modulation (SPM), cross-phase modulation (XPM), the optical Kerr effect,
self-rotation of the polarization ellipse, soliton interaction and
stimulated Raman scattering (SRS) are widely used in building
switching and logic elements. The nonlinear interferometer [6], nonlinear
loop mirror (nonlinear Sagnac interferometer) [7-10], optical SRS invertor
[11-13], optical switching system using the soliton dragging effect [14], Kerr
switching element [15], and actively mode-locked laser using XPM [16] are the
best-known examples of such devices. Two examples of all-optical
regenerators that use the same principles as logic and switching elements
are presented below.

2.1.  Nonlinear  Sagnac  interferometer  (NSI) 

An NSI can perform self-switching [7,8] or switching by a control pulse [9,10].
In the latter case timing of the output signal can be provided. The typical
scheme of an NSI of the second type (Fig. 1) consists of an optical coupler (1)
with ports 1, 2, 3, 4 and a coupling ratio of 50:50, a fiber loop and an additional
coupler (2) with port 5. The probe optical pulse is launched into port
1 and the control pulse into port 5. The wavelengths of the control and probe
signals are assumed to be different. If there is no control pulse, the
probe pulse at input 1 is completely reflected. This occurs because, after
the input signal propagates through coupler 1, the two output signals in
ports 2 and 3 have equal amplitudes but a phase difference of π/2. Then


Fig. 1. Diagram of a nonlinear Sagnac interferometer.
Inset: a schematic of AODM.
these pulses travel through the loop in opposite directions and pass
through the coupler again. During propagation through the coupler the pulses
acquire an additional phase difference of π/2, resulting in full cancellation
of the signal in port 4 and reflection of the signal into port 1. The
situation changes completely when a control pulse is launched
simultaneously with the probe pulse. The control pulse moves in one direction
and creates a nonlinear phase shift for the probe pulse moving in the
same direction. Neglecting the phase shift for the counterpropagating
pulse, we can consider $\Delta\varphi_{NL}$ as an additional phase difference between the
co- and counterpropagating pulses. Then

$$\Delta\varphi_{NL} = \frac{2\pi}{\lambda}\,\Delta n_{NL}\,L_w \tag{1}$$

where $\Delta n_{NL} = 2 n_2 I_c$, $n_2 = 3.2\times10^{-20}$ m$^2$/W [4], $I_c$ is the intensity of the
control pulse and $L_w$ is the walk-off length, i.e. the distance along which the
probe and control pulses travel together and overlap. $L_w$ can be
less than the loop length because the probe and control pulses move with
different group velocities, owing to the difference between their wavelengths
and the group velocity dispersion of the fiber. Under the condition that
$\Delta\varphi_{NL} = \pi$, the phase relationship between the pulses results in switching
of the optical energy into port 4. This type of nonlinear interferometer exhibits
very good characteristics: a low control pulse intensity because of the long
interaction length $L_w$, high temperature stability and low losses. It also provides
an efficient way of timing signals by control pulses. An example of this
type of NSI used for regeneration of signals is described in [9].
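
The switching behaviour described above can be illustrated with a short numerical sketch. It assumes an ideal, lossless loop with an exact 50:50 coupler, for which the fraction of probe power leaving port 4 is (1 − cos Δφ)/2, where Δφ is the nonlinear phase difference between the co- and counterpropagating pulses. The function name is hypothetical.

```python
import math

def port_powers(delta_phi):
    """Return (reflected into port 1, switched into port 4) power fractions.

    Assumes an ideal lossless Sagnac loop with a 50:50 coupler.
    """
    switched = 0.5 * (1.0 - math.cos(delta_phi))
    reflected = 1.0 - switched
    return reflected, switched

# No control pulse, half of the critical phase shift, and the critical shift pi.
for phi in (0.0, math.pi / 2, math.pi):
    reflected, switched = port_powers(phi)
    print(f"delta_phi = {phi:.3f} rad: port 1 = {reflected:.3f}, port 4 = {switched:.3f}")
```

Without a control pulse all the probe power returns to port 1, and a phase difference of π moves it entirely to port 4, in line with the description of the device.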

In the experiments the output pulse duration $\tau_p$ was about 9 ps, the
repetition rate $f_r$ was 5 Gbit/s, the signal pulse peak power was approximately 740 mW,
the switching contrast was 15 dB and the BER was $10^{-9}$. The loop length was 5 km.
Excellent restoration of the pulse shape, intensity and time position was
obtained.

The authors of [9] believe that the regenerator will operate well at
bit rates above several tens of Gbit/s. The disadvantages of this system are
the degradation of its characteristics when the product $\tau_p f_r$ tends to unity and
the need to operate at two wavelengths. The latter can be overcome under
certain conditions by using the same wavelength but two different
polarizations for the clock and signal pulses, if a birefringent fibre is used
[17].


2.2.  SRS  regenerator 

In [11-13] an all-optical SRS invertor that can be used as a regenerator
was proposed and studied. The optical invertor converts a high-level
input signal into a weak output signal and a weak input signal into a
high-level output signal. The SRS invertor (Fig. 2) consists of the
pump source, an SRS amplifier (short fibre), a rejection filter at the Stokes
frequency and an SRS generator (long fibre). The SRS phenomenon can be
described briefly as follows. If a sufficiently strong pump pulse is launched into the
fibre, amplification or generation of the Stokes pulse occurs. The Stokes
signal is downshifted in frequency relative to the pump pulse. The
intensity of the Stokes signal is

Fig. 2. Schematic diagram of AODM with an SRS invertor. a) Connector is open:
the system operates as an isolated SRS invertor.
b) Connector is closed: the system operates as AODM.


$$I_s(L) = I_s(0)\exp(g I_p L) \tag{2}$$

where $L$ is the interaction length between the pump and Stokes pulses, $I_p$ is
the pump intensity, $g = 0.9\times10^{-11}$ cm/W [4] and $I_s(0)$ is the initial Stokes
intensity. For simplicity we neglect absorption and depletion of the
pump signal.

When there is no input Stokes signal, $I_s(0)$ is determined by noise
scattering, and the fiber can serve as a generator of the Stokes signal if
$L$ is long enough. If there is an input signal $I_s(0)$, the fibre operates as a
Stokes amplifier (a nonlinear amplifier in the general case).
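
As a rough illustration of why a short fibre acts as an amplifier and a long one as a generator, the exponential gain of equation (2) can be evaluated numerically. The sketch below assumes a Raman gain coefficient of 0.9×10⁻¹¹ cm/W (a typical silica value for a pump near 1 µm), and the pump power, effective area and fibre lengths are hypothetical values chosen only for illustration. Pump depletion and absorption are neglected, as in the text.

```python
import math

G_RAMAN_M_PER_W = 0.9e-13  # assumed gain coefficient: 0.9e-11 cm/W expressed in m/W

def stokes_gain(pump_power_w, eff_area_m2, length_m, g=G_RAMAN_M_PER_W):
    """Small-signal Stokes gain I_s(L)/I_s(0) = exp(g * I_p * L)."""
    pump_intensity = pump_power_w / eff_area_m2   # W/m^2
    return math.exp(g * pump_intensity * length_m)

# Hypothetical pump: 10 W peak power in a 50 um^2 effective area.
short_fibre = stokes_gain(10.0, 50e-12, 100.0)    # short fibre: modest gain
long_fibre = stokes_gain(10.0, 50e-12, 1000.0)    # long fibre: very large gain
print(f"gain over 100 m: {short_fibre:.2f}")
print(f"gain over 1 km:  {long_fibre:.3e}")
```

Because the exponent scales with length, a tenfold longer fibre raises the gain by many orders of magnitude, enough for the Stokes pulse to build up from noise.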

The principle of operation of the SRS invertor can now be easily
understood. Let a signal pulse at the Stokes frequency be applied at the
input. This pulse is amplified in the short fibre and causes pump depletion.
The amplified Stokes pulse is rejected by the filter and does not reach
the output, and the depleted pump cannot generate a new Stokes pulse in the
long fibre. So there is no Stokes signal at the output. In the opposite
case, if there is no Stokes signal at the input, the powerful pump pulse
propagates through the first (short) fibre undepleted and then enters
the second (long) fibre, where the Stokes signal is generated. So the Stokes
signal appears at the output. If this process is repeated twice, the input
pulse is regenerated. The SRS regenerator ensures timing by synchronous
pumping. Restoration of the intensity and stabilization of the signal shape
are due to the periodic generation of the output Stokes pulses.
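
The regeneration principle can be sketched as two idealized inversions in series. The model below is a deliberately crude assumption: the invertor is treated as a hard threshold device that emits either a full-strength pulse or nothing, whereas the real transfer function is smooth. The threshold and pulse levels are hypothetical.

```python
def srs_invertor(stokes_in, threshold=0.5, high=1.0, low=0.0):
    """Idealized invertor (assumed hard threshold).

    A Stokes input above the threshold depletes the pump, so no output pulse
    is generated; otherwise the undepleted pump generates a full pulse.
    """
    return low if stokes_in > threshold else high

def regenerate(stokes_in):
    """Two inversions in series restore the logical level of the input."""
    return srs_invertor(srs_invertor(stokes_in))

# Degraded pulse amplitudes (hypothetical) after a long trip in the loop.
degraded = [0.9, 0.1, 0.7, 0.3, 0.0, 1.2]
restored = [regenerate(p) for p in degraded]
print(restored)  # [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]
```

Each output pulse is newly generated by the pump, which is why amplitude errors at the input do not accumulate from one pass to the next.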

3. Selected schemes of AODM

a) Dynamic optical memory with electrooptic regenerator [18,19].

The scheme is not quite all-optical because the regenerator includes an
electrically controlled optical modulator based on a LiNbO$_3$ crystal. But the
characteristics of this system approach those of an all-optical one. The
system operates like an actively mode-locked ring fiber soliton laser. A
one-million-kilometer trip of optical solitons around a 510-km-long fiber
loop containing ten erbium-doped fiber amplifiers has been demonstrated
(storage time of more than 5 s). The repetition rate was 5-10 Gbit/s, the pulse
duration was 30-40 ps and the peak power was $P_p$ = 0.6-2 mW. The BER was less than
$10^{-9}$. The authors of [18,19] pointed out that the electrooptic
regenerator ensures excellent suppression of jitter and restoration of the
soliton shape in spite of the rather short distance between solitons. They
therefore claimed that solitons can be transmitted over unlimited distances.
Unfortunately, the papers [18,19] do not contain data on the absolute
stability of synchronization between the reshaped signals and the clock regenerator,
which is important for memory applications. Potentially, a system with a
fibre loop of 510 km can contain more than $10^{7}$ bits of information.
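
The figures quoted for this scheme (a 510 km loop, a trip of one million kilometers, 5-10 Gbit/s) can be cross-checked with a short calculation. The sketch assumes a round group velocity of 2×10⁸ m/s in the fibre, which is not stated in the text, and the function names are hypothetical.

```python
GROUP_VELOCITY_M_PER_S = 2.0e8  # assumed round value for silica fibre

def storage_time_s(total_distance_m, v=GROUP_VELOCITY_M_PER_S):
    """Time a pulse spends travelling the given total distance."""
    return total_distance_m / v

def loop_capacity_bits(loop_length_m, bit_rate_hz, v=GROUP_VELOCITY_M_PER_S):
    """Number of bit slots that fit in one loop transit time."""
    return loop_length_m / v * bit_rate_hz

trip_m = 1.0e9       # one million kilometers
loop_m = 510.0e3     # 510 km loop

print(f"storage time: {storage_time_s(trip_m):.2f} s")
print(f"round trips:  {trip_m / loop_m:.0f}")
print(f"capacity at 5 Gbit/s:  {loop_capacity_bits(loop_m, 5e9):.3e} bits")
print(f"capacity at 10 Gbit/s: {loop_capacity_bits(loop_m, 10e9):.3e} bits")
```

Under this assumption the trip lasts about 5 s and the loop holds on the order of 10⁷ bit slots.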

b) AODM using a nonlinear Sagnac interferometer [20].

This scheme uses signal and control pulses at the same frequency
but with orthogonal polarizations. If the output signal pulse is used as the
control pulse, the system operates as an invertor or a shift register. The
repetition rate is 100 MHz and the pulse duration is approximately 15 ps. The timing is
provided by an external pulsed Nd:YAG pump laser. The optical losses are
compensated by an erbium-doped fiber amplifier. The length of the loop
in the NSI was 450 m, so the memory volume was 254 pulses. The system demonstrated
stable operation for hours in the optical invertor regime.

c) AODM with SRS invertor [21,22] (Fig. 2).

This example of AODM is one of the most stable and simple, because the SRS
regenerator used in the scheme is only slightly affected by temperature and
other external factors and still provides a high signal-to-noise ratio. It has
been demonstrated experimentally that the storage time $\tau_m$ > 10 min, which means
that the signals travel more than 100 million kilometers around the fibre loop. The
repetition rate was about 100 MHz, the pulse duration was 70-100 ps, and the
number of circulating pulses was about 600. The signal-to-noise ratio was
more than 50. A Nd:YAG laser at λ = 1.06 µm was used as the pump source, and
the signal wavelength corresponded to the Stokes wavelength (1.12 µm). The
system remains in a stable operating regime when the optical path
length is varied by a tunable delay line over an interval approximately equal to the pulse
duration, as well as under pump power fluctuations.


4.  Estimates  of  general  characteristics  of  AODM 


All the most important characteristics of AODM, such as $N$, $\tau_m$, BER and the
average signal power consumption $P_{av}$, are related to each other. Let us
estimate roughly some characteristics of a system with a regenerator based
on nonlinear cross-phase modulation and operating with soliton-like pulses.
$P_{av}$ and $N$ can be estimated from

$$P_{av} = P_p \tau_p f_r, \qquad N = \frac{L f_r}{v} \tag{3}$$

where $L$ is the total length of the fiber memory system and $v$ is the group
velocity. BER is a function of the peak pulse power $P_p$. $P_{av}$ has to be well
below 1 W because of the limited dynamic range of the signal
amplifier. It is desirable that the product $\tau_p f_r$ be on the order of $10^{-1}$ or
less to prevent soliton-soliton interaction or the influence of the
counterpropagating stream of pulses in the NSI. So a reasonable magnitude of $P_p$ is much
less than 10 W. It can be made low enough by using a sufficiently long
walk-off length $L_w$. Assuming that the necessary critical value of $\Delta\varphi_{NL}$ equals π, the
relationship for $P_p$ can be found from (1):

$$P_p = \frac{\lambda S_{eff}}{4 n_2 L_w}$$

where $S_{eff}$ is the effective cross-section of the fiber core. But in reality
$P_p$ cannot be too low for very short pulses, because it is difficult to obtain a
very long $L_w$ and because a low BER, and hence a high signal-to-noise ratio, is
required. For reasonable values of $S_{eff}$ ≈ 50 µm², λ = 1.5 µm and
$L_w$ = 500 m, $P_p$ is 1.1 W. Then for $f_r$ = 50 Gbit/s and $N = 10^{6}$, the total fiber
length is approximately 4 km. If a fiber with suitable group velocity dispersion
is used, the pulse duration can be as short as 1-3 ps. The BER can be less than
$10^{-9}$ in this case. For much higher $N$ and $\tau_m$ a number of regenerators can be
used. These estimates are very rough, but more precise calculations for
dispersion-shifted fibers give even more optimistic parameters.
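
The rough estimates of this section can be reproduced numerically. The sketch assumes that the critical phase shift π is reached when (2π/λ)·2n₂(P_p/S_eff)·L_w = π, that the capacity is N = L f_r / v, and that the group velocity is about 2×10⁸ m/s. These forms and the function names are assumptions made for illustration.

```python
def peak_power_for_pi_shift(wavelength_m, eff_area_m2, n2_m2_per_w, walkoff_m):
    """Peak power giving a nonlinear phase shift of pi (assumed XPM relation).

    (2*pi/lambda) * 2*n2 * (P/S) * L = pi  ->  P = lambda * S / (4 * n2 * L)
    """
    return wavelength_m * eff_area_m2 / (4.0 * n2_m2_per_w * walkoff_m)

def fiber_length_for_capacity(n_bits, bit_rate_hz, v=2.0e8):
    """Total fibre length needed to hold n_bits pulses, from N = L * f_r / v."""
    return n_bits * v / bit_rate_hz

def average_power(peak_power_w, pulse_duration_s, bit_rate_hz):
    """Average power P_av = P_p * tau_p * f_r."""
    return peak_power_w * pulse_duration_s * bit_rate_hz

p_peak = peak_power_for_pi_shift(1.5e-6, 50e-12, 3.2e-20, 500.0)
length = fiber_length_for_capacity(1e6, 50e9)
p_avg = average_power(p_peak, 2e-12, 50e9)   # tau_p * f_r = 0.1

print(f"peak power:    {p_peak:.2f} W")
print(f"fibre length:  {length / 1e3:.1f} km")
print(f"average power: {p_avg:.3f} W")
```

The result is a peak power slightly above 1 W and a fibre length of 4 km for one million pulses at 50 Gbit/s, with an average power well below 1 W.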


5.  Conclusion 

Feasibility of AODM using nonlinear optical phenomena in fibers has been
experimentally demonstrated with an information capacity of $N \sim 10^{2}$-$10^{3}$ and a storage
time of $\tau_m$ > 10 min. Efficient optical regenerators have been developed, so
improvement of the AODM characteristics up to $N > 10^{6}$, practically unlimited $\tau_m$,
a repetition rate of 10-50 Gbit s$^{-1}$ and optical power consumption of less than 1 W
at a low bit error rate seems entirely realistic. One of the major
problems now is to find the application areas in which AODMs will be used most
efficiently.

The authors are thankful to Dr. V. I. Belotitskii and Dr. V. V. Spirin for
helpful discussions.

Abstract. Intense photostimulated luminescence (PSL) with a peak at 420 nm is observed in
ultraviolet (UV) light-irradiated europium-doped potassium chloride (KCl:Eu) crystalline
phosphors. The PSL characteristics of UV-irradiated KCl:Eu phosphor for optical memory
applications are studied. The excitation and emission mechanisms of the 420 nm PSL, which are
consistent with the results obtained, are discussed.


1. Introduction 

A new type of optical memory based on the photostimulated luminescence (PSL)
phenomenon in electron trapping phosphor materials has been studied in
the fields of optical parallel Boolean logic operations [1], optical associative memory [2] and
optical neural networks [3,4]. Electron trapping phosphor materials can emit
output photons that differ from the input photons but correlate spatially in intensity with them.
Consequently, they can be used to store optical information as trapped
electrons, and the stored information can be read out by scanning the phosphor
with a laser beam [4,5]. The unique features of electron trapping phosphor materials that exhibit
the PSL phenomenon provide the potential for high bit storage densities, high data transfer
rates and fast recovery speeds [6]. Important characteristics of a good electron trapping phosphor
material for optical memory are high PSL brightness for low noise, short luminescence lifetime
for minimum read-out time and low light scattering for high bit storage densities. In particular,
electron trapping phosphor materials in the form of transparent single crystals or thin films provide
efficient PSL and low light scattering.

In a survey of many candidate transparent alkali halide phosphors, carried out to
obtain a new electron trapping phosphor material with high PSL brightness and low light
scattering, we found that transparent Eu-doped potassium chloride (KCl:Eu) crystalline
phosphors exhibit efficient PSL under optical stimulation with visible light after ultraviolet (UV)
light irradiation at room temperature (RT). In this paper, we report the PSL characteristics
of KCl:Eu crystalline phosphors for optical memory utilizing the PSL phenomenon. The PSL
excitation and emission mechanisms are also discussed.


2.  Experimental 

Crystalline phosphors of KCl:Eu used in the present study were grown by the Bridgman
method from a molten mixture of reagent-grade KCl and EuCl$_3$·6H$_2$O (0.1 mol%). The Eu$^{2+}$
concentration in the grown crystals was determined to be about $1.4\times10^{18}$ ions/cm$^3$ from the
intensity of the Eu$^{2+}$ characteristic optical absorption bands at about 243 nm and 343 nm
using Smakula's formula. The concentration of Eu$^{2+}$ ions in the crystal is about one order
of magnitude smaller than that of EuCl$_3$ in the original mixture. The KCl:Eu samples,
typically 5×5×1 mm$^3$ in size, were kept at 550 °C for 30 min and then quenched to RT to
disperse aggregates of the Eu impurity.

The UV-light irradiation for optical excitation was carried out using an Hg lamp.
The PSL measurements were carried out using a Hitachi F-3010 spectrofluorometer at room
temperature (RT). The optical absorption spectra were measured with a Hitachi U-2000
spectrophotometer. The PSL spectra were corrected for the diffraction of the grating and
the optical response of the photomultiplier. Details of the system used to measure luminescence have
been described elsewhere [7].

3. Results  and  Discussion 

An intense PSL peak at about 420 nm was observed when the UV-irradiated
specimen was stimulated with 580 nm light at RT. A typical 420 nm PSL emission spectrum
(solid line) and the stimulation (read-out) spectrum (dashed line) for the 420 nm PSL peak
of a KCl:Eu sample irradiated with 240 nm UV light are shown in Fig. 1. The excitation (write-in)
spectrum (dotted line) for the 420 nm PSL peak is also shown in Fig. 1. The 420 nm PSL
emission is assigned to an inner ionic transition (4f$^6$5d → 4f$^7$) of isolated Eu$^{2+}$ ions
occupying cation sites in the KCl crystal, since photoluminescence with a peak at about 420 nm
was observed when the specimen was excited with 243 nm or 343 nm light. These wavelengths
correspond to the maxima of the characteristic optical absorption bands of isolated Eu$^{2+}$
ions [8]. The optical absorption spectra before (dashed line) and after (solid line) UV-light
irradiation are shown in Fig. 2. The broad optical absorption band with a peak at about
560-580 nm, which is created by UV-light irradiation, is due to F centers [9] and/or
Z$_1$ centers, since it is believed that alkali halide crystals doped with Eu, Sm, Ca, Ba,
Sr, etc. (impurities with a low second ionization potential) give rise to Z$_1$ centers when
F centers are optically bleached at RT [10,11]. It is likely that the F centers are bleached
during UV-light irradiation, because an Hg lamp was used as the UV-light source. Since
this absorption band is in good agreement with the stimulation spectrum shown in Fig. 1,
there appears to be a close relationship between the 420 nm PSL peak and the F
centers and/or Z$_1$ centers. The excitation spectrum (dotted line) in Fig. 1 agrees with
the high-energy absorption band of Eu$^{2+}$ ions shown in Fig. 2. This result suggests that the
420 nm PSL is excited only via the Eu$^{2+}$ high-energy absorption band. It has been reported that
Eu-doped KCl single crystals exhibit photoconductivity during UV-light irradiation, as
shown in Fig. 3 [12]. The photoconductivity spectrum in Fig. 3 coincides with the
excitation spectrum of the 420 nm PSL peak in Fig. 1. These results are
thus consistent with the following excitation and emission mechanism of the 420 nm PSL peak.

Fig. 1. Typical PSL emission spectrum (solid line) when UV-irradiated KCl phosphor was stimulated with
660 nm light. The excitation spectrum (dotted line) and stimulation spectrum (dashed line) for the 420
nm PSL peak are also shown in the figure. Labels in the figure: excimer (KrF) laser, Ar+ laser, He-Ne laser.

Fig. 2. Optical absorption spectra before (dashed line)
and after (solid line) UV-light irradiation.

Fig. 3. Photoconductivity spectrum [12] of KCl:Eu
phosphor measured under UV-ray irradiation, and
energy band diagram for the excitation mechanism
in the UV-ray irradiated specimen.

Fig. 4. Possible configurational model for the electron
traps in the UV-ray irradiated specimen.

Owing to the overlap between the Eu$^{2+}$ high-energy absorption band and the KCl conduction
band, Eu$^{2+}$ photoionization occurs and free electrons are subsequently produced during UV-light irradiation,
as shown schematically in Fig. 3. It is therefore likely that some of the electrons excited
from Eu$^{2+}$ ions to the conduction band by UV irradiation are trapped at anion vacancies
neighboring Eu$^{2+}$ ions, producing F centers that are strongly perturbed by neighboring
positive-ion vacancies and Eu$^{2+}$ ions, like Z$_1$ centers, as shown in Fig. 4. On subsequent stimulation with
560-580 nm light, electrons that are optically released from the complex centers recombine
with Eu$^{3+}$ ions, producing excited Eu$^{2+}$ ions from which the 420 nm PSL is emitted.
Transparent KCl:Eu crystalline phosphors can thus be used to store optical information
written with UV light such as that of a KrF laser (248 nm), and the information can be read out using
visible light from a He-Ne laser or Ar$^+$ ion laser, because the peak wavelengths of the
excitation and stimulation spectra are 240 nm and 580 nm, respectively, as shown in Fig. 1.
In addition to storage, the KCl:Eu phosphor is of interest for hardware devices capable
of performing multiplication, addition and subtraction within a dynamic range covering a few
orders of magnitude, since the 420 nm PSL emission intensity is proportional to the UV-light
irradiation dose, as shown in Fig. 5. This result also suggests that the PSL phenomenon in
KCl:Eu phosphors is applicable to an analog memory device for a learning optical neural
network.

The lifetime of the 420 nm PSL was about 1.6 µs. This lifetime is short enough for
high-speed operation of the optical memory. We confirmed that the information stored in the
UV-irradiated specimen exhibited excellent fading characteristics at RT in the dark.
Figure 6 shows the fading characteristics of the PSL intensity in a UV-irradiated sample
at RT in the dark. The PSL intensity decreases with elapsed
time in the early stage and then levels off at a constant value. One can thus expect
that stored UV write-in information is fairly stable at RT in the dark.

Fig. 5. PSL intensity as a function of UV write-in intensity in KCl:Eu crystal
(horizontal axis: UV irradiation time in seconds).

Fig. 6. Fading characteristics of the PSL intensity in UV-irradiated KCl:Eu crystal
at RT in the dark.

4. Summary 

We have observed intense PSL with a peak at about 420 nm when UV-irradiated
KCl:Eu crystalline phosphors are stimulated with 560-580 nm light. The PSL intensity is
proportional to the UV irradiation dose. As an electron trapping phosphor material, the
KCl:Eu phosphor is one of the most attractive candidates for erasable and rewritable optical
memory utilizing the PSL phenomenon, since excimer lasers such as KrF (248 nm) and KrCl
(222 nm) lasers and visible-emitting lasers such as He-Ne and Ar$^+$ ion lasers can be used as
write-in and read-out light sources, respectively.

Abstract. The nonlinear optical properties of a hybrid AlAs/GaAs multilayered
heterostructure are presented. Experimentally, optical bistability at
optical intensities as low as 30 mW cm$^{-2}$ is obtained when a bias voltage is
applied perpendicular to the layers.


1.  Introduction 

In 1991 He and Cada [1] predicted optical bistability in a periodic multilayered
AlAs/GaAs structure at optical intensity levels of 10 kW cm$^{-2}$. In the following year Cada
et al [2] confirmed this prediction experimentally in a 30-period GaAs/AlAs
structure, and in 1993 He et al [3] reported switching times < 50 ps for optical
bistability in a heterostructure. By establishing a dc electric field perpendicular to the layers, the
authors achieved electrically induced optical bistability in a similar multilayered
structure, lowering the threshold intensity for bistability by three orders
of magnitude to 1.4 W cm$^{-2}$ [4]. A phenomenological model for this hybrid case
was presented, proposing two states of the device: a charge transport state and a charge storage
state [5]. Formulating equations for this electrically induced optical bistability, Ivanov
and Haug [6] determined optical switching times < 100 ns, leading to
switching energy densities < 1 fJ µm$^{-2}$.
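
The order of magnitude of the switching energy density can be checked by multiplying a threshold intensity by a switching time. The sketch below uses the 1.4 W cm⁻² threshold and the 100 ns upper bound quoted above purely as illustrative inputs. Pairing these two numbers is an assumption, since the cited model may use different operating conditions, and the function name is hypothetical.

```python
def energy_density_fj_per_um2(intensity_w_per_cm2, switching_time_s):
    """Energy per unit area delivered during switching, in fJ per square micron."""
    joule_per_cm2 = intensity_w_per_cm2 * switching_time_s
    femtojoule_per_cm2 = joule_per_cm2 * 1e15
    return femtojoule_per_cm2 / 1e8        # 1 cm^2 = 1e8 um^2

# Illustrative inputs: 1.4 W/cm^2 applied for 100 ns.
print(f"{energy_density_fj_per_um2(1.4, 100e-9):.2f} fJ/um^2")
```

The result is of the order of 1 fJ per square micron, and shorter switching times give proportionally smaller values.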


2.  Experimental  results 

In this paper, a 20-period AlAs/GaAs multilayered structure is investigated (Fig. 1).
The layers are grown by molecular beam epitaxy (MBE) on an i-GaAs substrate with
nominal thicknesses of 58 nm for GaAs and 69 nm for AlAs. Gold contacts
are evaporated on the top and the bottom to establish a dc electric field in the structure.
The load resistance can be increased with the external resistor $R_{ext}$.

First, as an electro-optical characterization, the reflection spectra of the structure
with and without an applied voltage V are shown in Fig. 2 for a temperature of -13.5 °C.
It is remarkable that for wavelengths λ = 868 nm to 874 nm the reflection $\mathcal{R}$
is lowered, and for λ = 874 nm to 878 nm $\mathcal{R}$ is increased, by an applied electric field
perpendicular to the layers.

Fig. 1: Sketch of the hybrid multilayered structure.

Fig. 2: Reflection spectra measured with and without an impressed voltage.

In a further experiment, $\mathcal{R}$ is measured versus the incident optical intensity $I_{in}$ at
λ = 875 nm and an impressed voltage V = 60 V (Fig. 3). Starting at low intensities,
$\mathcal{R}$ first decreases with increasing $I_{in}$. At a threshold intensity $I_{th}$ = 96 mW cm$^{-2}$
the reflection switches to a higher state. As can be seen from Fig. 3, a switching contrast
as high as 5.0 dB is achieved. Beyond that point, $\mathcal{R}$ is constant with increasing $I_{in}$.
When $I_{in}$ is lowered again, $\mathcal{R}$ remains almost constant at the higher level and switches down
to the lower state at $I_{in}$ = 12 mW cm$^{-2}$. Hence, a hysteresis loop is formed.
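
The hysteresis loop described above can be mimicked by a two-threshold state machine. The sketch is a simplified assumption: it tracks only which reflection branch the device is on, using the up-switching and down-switching intensities quoted in the text, and ignores the gradual variation of the reflection within each branch. The function name and the intensity sweep are hypothetical.

```python
def reflection_branch(intensities, i_up=96.0, i_down=12.0):
    """Follow the device state ('low' or 'high' reflection) along an intensity sweep.

    i_up and i_down are the up- and down-switching intensities in mW/cm^2.
    """
    state = "low"
    history = []
    for i_in in intensities:
        if state == "low" and i_in >= i_up:
            state = "high"       # up-switching at the threshold intensity
        elif state == "high" and i_in <= i_down:
            state = "low"        # down-switching at the lower intensity
        history.append(state)
    return history

sweep = [10, 50, 100, 150, 100, 50, 20, 10]   # mW/cm^2, up and then down
print(reflection_branch(sweep))
# ['low', 'low', 'high', 'high', 'high', 'high', 'high', 'low']
```

At 50 mW cm⁻² the device is in the low state on the way up and in the high state on the way down, which is the bistable region of the loop.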

In another experiment the electro-optical modulator properties of the device
are compared with the properties under conditions for optical bistability. The solid
line in Fig. 4 shows the contrast ratio of the electro-optical modulation, K_mod =
10 · lg[R(60 V)/R(0 V)], as a function of λ at I_in = 150 mW cm^-2. In agreement with
Fig. 2, K_mod changes sign: it is negative for λ =
868 nm to 874 nm and positive for λ = 874 nm to 878 nm. As shown
in Fig. 4, K_mod reaches its peak value of 5.9 dB at λ = 875.3 nm. The crosses in Fig. 4
represent the switching contrast at I_th with V = 60 V for different wavelengths. In the
wavelength range of λ = 874 nm to 878 nm upswitching of R occurs as sketched in Fig. 3.
In this region the counterclockwise hysteresis loops give a positive value of the switching
contrast. In contrast to this counterclockwise hysteresis, clockwise hysteresis loops
occur in the range of λ = 868 nm to 874 nm. In this case, starting at low intensities,
R increases with increasing I_in. At I_th, R switches down to a lower state. Beyond this
point, R stays almost constant. When I_in is lowered, R switches back to the first state at
I_in < I_th.
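As a small aid for reading the decibel values, the contrast ratio defined above can be evaluated numerically. The sketch below assumes the definition K = 10 · lg(R_on/R_off) with linear reflectivities as inputs; the helper names are hypothetical.

```python
import math


def contrast_db(r_on, r_off):
    """Contrast ratio in dB, 10 * log10(r_on / r_off), for linear reflectivities."""
    return 10.0 * math.log10(r_on / r_off)


def ratio_from_db(k_db):
    """Linear reflectivity ratio that corresponds to a contrast of k_db decibels."""
    return 10.0 ** (k_db / 10.0)


# The 5.9 dB peak quoted in the text corresponds to a reflectivity ratio of
# a little under 4.
peak_ratio = ratio_from_db(5.9)
```

A negative value simply means that the reflection with voltage applied is lower than without, as in the 868 nm to 874 nm range.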

Using a second device, I_th is determined as a function of the voltage V at room
temperature. In Fig. 5 the contrast ratio for upswitching of R at I_th is shown in the
voltage range of V = 40 V to 115 V. As can be seen, I_th decreases with increasing voltage.
As a key result, I_th is lowered to 30 mW cm^-2.

In a last experiment the influence of the load resistance on the optical bistability
is observed using the second device. When the load resistance is increased by the additional
external resistor R_ext, the contrast ratio at I_th decreases (Fig. 6). It is remarkable
that the peak value of the optical switching contrast is achieved at R_ext = 0.

Fig. 3: Optical bistability measured at λ = 875 nm (horizontal axis: input optical intensity in mW/cm^2).

Fig. 4: Contrast ratio of electro-optical modulation within a voltage swing of 60 V
(solid line) and contrast ratio of optical bistability at I_th with V = 60 V (crosses).

Fig. 5: Threshold intensity I_th for the optical bistability versus the applied voltage.

Fig. 6: Contrast ratio of optical bistability as a function of the external resistor.


3.  Discussion 

Our measurements show electro-optical modulation and electrically induced optical bistability
using an AlAs/GaAs multilayered structure. As depicted in Fig. 3, optical bistability
with a switching contrast of 5 dB is realized at λ = 875 nm. Compared with the
results in [4], the nonlinearity of the structure is greatly enhanced and the switching
contrast is increased by more than a factor of 5.

In Fig. 4, the contrast ratio of optical bistability follows the behaviour
of K_mod(λ). Hence, the switching contrast of the bistability and that of the electro-optical
modulation coincide. So for a given optically induced change of the optical
parameters the switching contrast can easily be enlarged by a modified structure, e.g.
a distributed-feedback/Fabry-Perot etalon configuration.

As can be seen from Fig. 5, the optical threshold intensity is lowered by increasing
the applied voltage perpendicular to the layers. Hence, the responsivity of the structure
can be improved by increasing the electric field in the structure. As a consequence, for
the first time optical bistability is achieved at an optical threshold intensity as low as
30 mW cm^-2 in such a multilayered structure. If the spot diameter is reduced from 60 µm
down to 20 µm, an optical threshold power of about 100 nW is expected for bistability.
Therefore an array of 100 × 100 elements can easily be switched using a laser diode with
an optical power < 10 mW.
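The power estimate above can be checked with a short calculation. The sketch assumes a uniformly illuminated circular spot (a simplification; a real beam profile is not uniform) and uses the 30 mW/cm^2 threshold and the spot diameters from the text; the function name is hypothetical.

```python
import math


def threshold_power_w(intensity_mw_cm2, spot_diameter_um):
    """Optical power (W) for a given intensity on a uniform circular spot."""
    radius_cm = 0.5 * spot_diameter_um * 1e-4   # 1 um = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2
    return intensity_mw_cm2 * 1e-3 * area_cm2   # mW -> W


p_20 = threshold_power_w(30.0, 20.0)   # roughly 94 nW, i.e. "about 100 nW"
p_60 = threshold_power_w(30.0, 60.0)   # nine times larger (area scales with d^2)
array_power_w = 100 * 100 * p_20       # whole 100 x 100 array, just under 1 mW
```

Under these assumptions the full array needs roughly 1 mW at threshold, which leaves about an order of magnitude of margin below the 10 mW laser diode mentioned in the text.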

In Fig. 6 one can see that the optical switching contrast ratio is lowered by an additional
external resistor R_ext. As a key result, the contrast ratio of the optical bistability
could be enhanced by decreasing the load resistance. Furthermore, lowering the load
resistance decreases the RC time constant of the device. As mentioned by Ivanov and
Haug [6], this RC time constant is the limiting factor for the dynamical properties of the
device. As a consequence, a decrease of the internal load resistance will most probably
improve the optical switching contrast and the dynamical behaviour of our device.


4.  Conclusion 

In summary, strongly enhanced nonlinear optical properties and very high optical responsivity
are measured in a hybrid multilayered heterostructure. These properties
of this nonlinear medium lead to various potential applications in low-intensity
digital optical computing devices.

Abstract. Compressive strain in InGaAs/Ga(Al)As quantum wells leads to a
reduction in the saturation carrier density for the bandedge exciton due to a
decrease in the hole density of states. An Indium concentration of 11% gives a
reduction by a factor of 9 from GaAs, thereby opening a new path towards
viable nonlinear optical devices. The relevant issues of carrier lifetimes and
strain relief are discussed.


The  concept  of  all-optical  computing  based  on  nonlinear  optical  devices  has  been  in  vogue  for 
several  years.  Considerable  effort  has  been  spent  on  developing  suitable  architectures  and 
materials  with  sufficient  nonlinearity  to  be  useful  in  device  arrays.  Unfortunately  to  date  even 
the  strongest  known  nonlinearity  is  still  insufficient  to  be  useful.  One  of  the  most  promising 
materials  is  a  direct  gap  quantum  well  such  as  the  prototypical  GaAs/GaAlAs  structure  with  its 
strong  saturable  exciton  resonance  at  the  bandedge.  Saturation  of  the  exciton  absorption 
through  band  filling  effects  at  room  temperature  [1]  is  a  function  of  the  densities  of  states  of  the 
electrons  and  holes  making  up  the  exciton.  Any  reduction  in  the  density  of  states  would  lead  to 
more effective bandfilling for fixed carrier concentration and hence to a lower saturation density
and  possibly  lower  saturation  intensity,  the  latter  being  most  important  for  devices. 

We have shown that in an InGaAs/GaAs quantum well Fabry-Perot device structure [2]
compressive  strain  in  the  InGaAs  wells  gives  a  factor  of  2  reduction  in  saturation  carrier 
density.  Jin  et  al  [3]  have  also  demonstrated  a  reduction  in  a  well  structure  containing  15% 
Indium.  More  recently  [4],  we  have  shown  that  with  proper  control  of  strain,  the  saturation 
density  can  be  reduced  by  up  to  an  order  of  magnitude  compared  to  GaAs  wells.  This  occurs 
at  an  In  concentration  of  about  10%  and  indicates  that  further  increase  in  In  concentration 
without  strain  relief  could  lead  to  further  reduction  in  saturation  values.  This  represents  a 
significant  step  towards  a  set  of  materials  with  sufficient  nonlinearity  for  useful  devices.  In 
this  paper,  we  summarise  these  key  results  and  discuss  the  problems  that  need  to  be  resolved 
before  the  effects  can  be  used.  In  particular,  carrier  lifetimes  in  Indium  doped  GaAs  are  short 
[5] for reasons not fully understood. Since carrier lifetimes are directly involved in saturation
intensities, application to nonlinear devices will require this material parameter to be optimised.

Samples of In_xGa_(1-x)As/GaAlAs quantum wells were grown by metal organic vapour
phase epitaxy (MOVPE) [6], with x ranging from 0 to 0.15. Low temperature luminescence
and absorption measurements together with the known well widths gave accurate values for the
Indium concentrations. A series of samples with x = 0, 0.03, 0.11 and 0.15, labelled S1 to S4
respectively in the figures, were used for detailed measurements. Details of the samples are
given in Table 1. Samples with x = 0.03 and 0.11 were fully strained. The x = 0.15 sample
was partially strain relieved, as shown by cross-hatching under a Nomarski microscope. Even
though the GaAs substrate has a larger bandgap than the InGaAs wells, bandtailing in the
substrate prevented nonlinear measurements in transmission for low Indium concentrations, so
the substrate was etched away in all samples. All measurements were made at room
temperature.


Table 1. Details of InGaAs/AlGaAs samples.

Sample Code   Indium Concentration   No. of Periods   Well/Barrier Width (nm)   Barrier Composition
S1            0.00                   10               8/8                       Al0.2Ga0.8As
S2            0.03                   10               8/8                       Al0.2Ga0.8As
S3            0.11                   10               8/8                       Al0.2Ga0.8As
S4            0.15                   10               8/8                       Al0.2Ga0.8As


The saturation carrier density N_sat at the lowest energy exciton resonance was
calculated from the dependence of the peak excitonic absorption coefficient α(N) on incident
resonant light intensity I according to the empirical saturation equation:

$$\alpha(N) = \frac{\alpha_0}{1 + N/N_{\mathrm{sat}}}$$

where N is the carrier density and α_0 is the saturable low intensity absorption coefficient.
α(N) is measured from the intensity dependent transmission spectra near the exciton resonance
using the technique described in [4]. The carrier density N is determined from N = α(N) I τ / hν,
where τ is the carrier lifetime, hν is the photon energy and I is the incident intensity. The
lifetime τ is measured by the standard pump/probe technique over the intensity range used to
measure α(I).
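The two relations used here can be sketched numerically. This is an illustration under stated assumptions, not the authors' analysis code: it assumes the common empirical saturation form α(N) = α_0/(1 + N/N_sat) and the steady-state relation N = α I τ/hν; the absorption coefficient and intensity in the example are hypothetical, while the 420 ps lifetime and the 854 nm wavelength are taken from the text.

```python
PLANCK_J_S = 6.62607015e-34      # Planck constant in J s
LIGHT_SPEED_M_S = 2.99792458e8   # speed of light in m/s


def photon_energy_j(wavelength_nm):
    """Photon energy h*nu in joules for a vacuum wavelength in nm."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)


def carrier_density_cm3(alpha_per_cm, intensity_w_cm2, lifetime_s, wavelength_nm):
    """Steady-state carrier density N = alpha * I * tau / (h*nu) in cm^-3."""
    return alpha_per_cm * intensity_w_cm2 * lifetime_s / photon_energy_j(wavelength_nm)


def saturated_absorption(alpha0_per_cm, n_cm3, n_sat_cm3):
    """Assumed empirical saturation form alpha(N) = alpha0 / (1 + N/N_sat)."""
    return alpha0_per_cm / (1.0 + n_cm3 / n_sat_cm3)


# Hypothetical example: alpha = 1e4 cm^-1 and I = 1 kW/cm^2 are assumed values.
n_example = carrier_density_cm3(1.0e4, 1.0e3, 420e-12, 854.0)
half_alpha = saturated_absorption(1.0e4, 1.0e17, 1.0e17)   # N = N_sat halves alpha0
```

Because α itself depends on N, the measured α(N) at each intensity is the value to insert; N_sat is then the density at which the absorption has dropped to half of α_0.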


Figure 1 Pump/probe decays at peak of heavy hole exciton in S1 (GaAs) and S2 (3% In).

Figure 1 shows decays of the probe signal for the case of x = 0 (S1) and x = 0.03 (S2)
as a function of delay after the pump where both pump and probe are resonant with the lowest
energy InGaAs exciton. The carrier lifetime is 4 nsec for the GaAs well and only 420 psec for
the 3% In case. In fact, all of the InGaAs wells gave approximately the same lifetime [5],
indicating low quantum efficiency. This is attributed to either thermal activation over the
barriers or quenching due to defects at the GaAlAs barrier interface. With the known lifetimes,
we can now calculate N_sat from the following measurements.

Figure 2(a) shows the absorption spectra as a function of I for the 3% Indium sample
(S2). The heavy hole resonance at 854 nm is clearly seen and its absorption decreases strongly
for input intensities up to 14 kW cm^-2. The variation of the peak absorption coefficient with N
is shown in Figure 2(b) for two Indium concentrations, 0.03 and 0.11, respectively. It is clear
that saturation of the exciton occurs at much lower carrier densities for the higher Indium
concentration.  We  calculate  Nsat  to  be  7  x  lOl"^  and  1  x  10^^  cm-3  for  x  =  0.03  and  0.11, 
respectively. Figure 3 shows how N_sat varies with Indium concentration for all four samples.


Figure  2  (a)  Absorption  spectra  for  S2  (3%  In)  for  I  =  0.009,  2.23,  9..^  and  14  kW  cm'^.  (b)  Peak 

exciton absorption coefficient as a function of carrier density for S2 (o, 3% In) and S3
(□, 11% In), normalised to each other.


It is clear that increasing the Indium content leads to a very significant decrease in N_sat,
almost an order of magnitude for x = 0.11 compared to GaAs (x = 0). The fact that
improvements in N_sat of this size can be achieved by merely engineering the composition is a
hopeful sign that all-optical nonlinearities may eventually be useful. The decrease in N_sat has
been correlated with the reduction in the density of hole states [4] brought about by a decrease
of the effective mass near k = 0 due to compressive strain [7].


In Figure 3, the greatest change in N_sat is between x = 0.03 and 0.11. Sample S4 with
x = 0.15 is similar to S3 and this can be explained by the presence of observed strain relaxation
in this sample. Also included in Figure 3 are two other data points, one for an InGaAs
asymmetric Fabry-Perot modulator structure (AFPM) [2] with a small reduction in N_sat, even
though x = 0.1. This can also be explained by the fact that this structure has a much greater
number of periods than S1-S4 and both well and barrier layer widths are much greater than the
layer widths of S1-S4. Consequently, this structure is much more sensitive to strain
relaxation, which does indeed occur. Finally, the work of Jin et al [3] is included, in which x =
0.15, similar to S4. In their case, N_sat had a much higher value than in S4, again possibly due
to strain relief. These results point to the importance of careful control of strain. Control
becomes more critical as the indium content increases since the critical thickness before the
onset of strain relief also decreases.


A reduction in the hole density of states should also reduce the absorption coefficient at
the bandedge. In devices requiring a certain amount of absorption this could be compensated
for by an increase in the number of layers. Because there is a critical thickness for strained
structures, this is an undesirable feature. However, no evident reduction in the exciton
absorption strength was observed up to x = 0.15, showing that it is possible to get significant
reduction in N_sat with little reduction in the excitonic value of α_0.

Figure 3 Plot of saturation carrier density against indium concentration for all measured samples.

To translate the effect into device usefulness the saturation intensity I_sat must also be
similarly reduced. Since I_sat scales inversely with the carrier lifetime τ, it is necessary to be able to
independently control τ and eliminate any nonradiative decay. Lifetimes in these InGaAs wells
are much shorter than in GaAs and to date this seems to be a general problem [8] due to
defects, low barrier heights or some other as yet unknown factor. It should be pointed out that
these lifetimes are measured at densities much lower than typical in lasers where defects may be
saturated, recovering high quantum efficiency.

In  summary,  this  paper  has  outlined  a  path  towards  enhancing  resonant  optical 
nonlinearities  by  compressive  strain  in  InGaAs/GaAlAs  quantum  wells.  An  order  of 
magnitude reduction in saturation carrier density has already been observed with 11% Indium.
This indicates several avenues for further research including control of carrier lifetimes to also
reduce  saturation  intensities,  investigation  of  structures  with  higher  Indium  content,  control  of 
strain  relief,  and  other  strained  systems  with  similar  bandgap  curvature  engineering 
possibilities.

Abstract. We report experimental results on the dynamical behaviour
of n-i-p-i based smart pixels, composed of photoconductive switches and
electroabsorptive n-i-p-i modulators. For the photoconductive switch we
present switching times of 1.9 ns at an optical power of 880 µW, corresponding
to a switching energy of 1.7 pJ. The contrast of the electronic
output  signal  is  larger  than  lO”^  and  a  maximum  dc  gain  exceeding  10^ 
is achieved. For the opto-optical switching, contrast ratios of 4:1 at 1.6 mW
output power are shown with switching energies of 2.4 fJ/µm^2 (1.7 pJ).
The  opto-optical  gain  is  tunable  from  10-10^. 


1.  Introduction 


Recently we have demonstrated a new smart pixel concept [1] consisting of a photoconductive
switch with high electrical gain [2] and a high contrast electro-optical
n-i-p-i modulator [3]. A schematic picture of our smart pixel concept is shown in
figure 1. The switch is connected to an electro-optical modulator, whose output
changes between its 'on'- and 'off'-value depending on whether the switch is 'open' or
'closed'. One advantage of such a hybrid concept is that both elements (switch and
modulator) can be optimized separately. Therefore the switches can have high responsivity
and high photoconductive gain and the modulators can have high contrast
and low insertion loss. Due to the high photoconductive gain of the switch only a
small optical input is necessary to control the much larger output power of the
n-i-p-i modulator. Thus pure opto-optical logic devices with high optical gain are
realized, requiring only a dc voltage for operation.


Fig. 1: Schematic diagram depicting our smart pixel concept (circuit elements:
reference diode, n-i-p switch, n-i-p-i modulator). The smart pixel
consists of an opto-electrical switch, a reference diode and a high contrast
electro-optical n-i-p-i modulator. In this circuit the large modulator output
power P_out is controlled by a small input power P_sw incident onto the switch.

2. Opto-electrical switching

The photoconductive switch is basically a pin-FET, integrated into a single device.
The goal of designing this switch is to obtain the largest possible photoconductive
response in an n-channel with a minimum number of photons incident onto the switch.
Apart from optimizing the quantum efficiency of the pin diode, this goal can be
achieved by minimizing the capacitance of the switch. Therefore we use a sophisticated
sample design with a small detection area which surrounds a spatially separated large
absorption area. Figure 2 shows a SEM picture of the switch, which has been monolithically
integrated with the reference diode. The large absorption area is always
depleted and thus does not contribute to the device capacitance [4]. Due to the 'giant
ambipolar diffusion' constant of such structures [5], the diffusion of the photo-generated
carriers from the inner absorption area to the surrounding detection area is fast enough
(e.g. τ_diff < 50 ps if the diameter of the absorption area is less than 20 µm). In
figure 3 experimental results on the dynamical switching behaviour of the photoconductive
switch are shown. With an optical power of 880 µW the switching
time of the n-layer current is 1.9 ns. This corresponds to a switching energy of 1.7 pJ
(= 2.4 fJ/µm^2, referring to the area of this device with diameter ⌀ = 30 µm). For this short switching
time the opto-electrical gain is still 40. Figure 4 shows the dc-switching behaviour
of the n-layer current vs. the optical power at the photoconductive switch.
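The quoted switching energy and energy density follow from the product of optical power and switching time. The sketch below reads the optical power as 880 µW, which is the value implied by the stated 1.7 pJ at 1.9 ns, and assumes the energy is referred to the full circular area of the 30 µm device; the function names are hypothetical.

```python
import math


def switching_energy_j(power_w, time_s):
    """Switching energy as optical power times switching time."""
    return power_w * time_s


def energy_density_fj_um2(energy_j, diameter_um):
    """Energy per unit area in fJ/um^2 for a circular device of given diameter."""
    area_um2 = math.pi * (diameter_um / 2.0) ** 2
    return energy_j * 1e15 / area_um2   # J -> fJ


e_sw = switching_energy_j(880e-6, 1.9e-9)     # about 1.7 pJ
density = energy_density_fj_um2(e_sw, 30.0)   # about 2.4 fJ/um^2
```

Both results agree with the figures given in the text to within rounding.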

The  curves  correspond  to  different  optical  power  levels  Pref  incident  onto  the 
reference  diode,  shifting  the  switching  point  along  the  power  axis.  For  the  smallest 
reference  power  the  contrast  between  the  'on'  and  'off'-values  of  the  n-layer  current 
is  larger  than  10^,  corresponding  to  a  high  photoconductive  gain  of  10^. 


Fig. 2: SEM picture of a monolithically
integrated electro-optical switch. The
total diameter of the ring structure is
30 µm. The two metal rings are 2 µm
thick and separated by 1 µm.

3.  Opto-optical  switching 

Fig. 3: Dynamical switching behaviour of the n-layer current I_nn of the photoconductive
switch (horizontal axis: time in ns). In this case I_nn is switched on and off by pulses of the
same optical power (P_sw = P_ref). The switching time is 1.9 ns,
corresponding to a switching energy of 1.7 pJ for this ⌀ = 30 µm device.

Fig. 4: Double logarithmic plot of the n-layer current vs. the optical power P_sw at the
switch (horizontal axis: optical power in W) for different values of P_ref.
If the optical power at the switch P_sw exceeds the reference power P_ref at the
reference diode, the voltage distribution between switch and reference diode
changes and simultaneously the n-layer current of the switch rises from its
depletion value up into the mA range.

By combining the photoconductive switch with a high contrast n-i-p-i modulator, as
described in Ref. [3], we obtained opto-optical switching. Depending on whether the
switch is in the high or low resistance state, the voltage (see figure 1) drops
either across the switch or the modulator. If the voltage drops across the switch,
the modulator is in its transparent state and the optical output signal P_out is
high. Our hybrid smart pixel concept allows for opto-optical switching with high
optical gain and low optical switching energies. It is possible to achieve various
logical functions. Here we show results of a NOR gate. Figure 5 shows the switching
behaviour of the optical output P_out of the n-i-p-i modulator vs. the optical input
power P_sw incident onto the switch. In this case the optical output signal changes
from 400 µW to 1.6 mW, corresponding to an on/off ratio of 4:1. As shown in figure 5
the switching point can be adjusted by changing the reference power P_ref. Depending
on  the  switching  power  an  opto-optical  gain  ranging  from  10  to  about  10^  has  been 
achieved. Figure 6 shows the dynamical behaviour of the smart pixel. With pulses
of P_sw = P_ref = 2 µW incident either onto the switch or the reference diode the optical
output can be switched 'on' or 'off'. The corresponding switching time is 1 µs, leading
to a switching energy of 2 pJ. The optical gain in this case is 750.


4.  Conclusions 

We  have  demonstrated  low  optical  switching  energies  using  a  new  n-i-p-i  based 
smart  pixel  concept.  Large  values  for  the  electrical  gain  (>  10^)  could  be  obtained 
with switching energies as low as 1.7 pJ, corresponding to 2.6 fJ/µm^2. The switching
power P_sw, the electrical and optical gain and the switching times are externally
adjustable. We have demonstrated opto-electrical switching at switching times of 1.9 ns
for a switching power of 880 µW. By forming a smart pixel composed of an opto-electrical
switch with photoconductive gain and an electroabsorptive n-i-p-i modulator
we were able to obtain opto-optical switching with low energies (2 pJ). The optical
gain is adjustable from 30 to 10^, the contrast of the output signal was 4.5:1.

Fig. 5: Opto-optical switching of the smart pixel. For various optical power
levels P_ref on the reference diode the diagram shows the output power of the
n-i-p-i modulator controlled by the much lower input power on the photoconductive
switch.

Fig. 6: Time dependence of the optical output signal from the n-i-p-i
modulator modulated by the input signal P_sw on the switch (horizontal axis: time in µs).
For a switching power P_sw = 2 µW switching times of 1 µs are achieved (E_sw = 2 pJ). The
optical gain is 750 and the on/off ratio of the modulated signal is 4.5:1.


Finally we note that the high saturation current of our switch would also be sufficient
to operate another kind of "smart pixel", composed of our switch and a vertical
cavity surface emitting laser structure (VCSEL).

This work has partly been supported by the German Ministry of Research and
Technology (BMFT) under (TK 0567 O). The collaboration with Micronic Laser
Systems AB regarding the mask written by the Laser Pattern Generator LBG 15P
is gratefully acknowledged.

Abstract

Above the bandgap energy, the combination of the absorptive and dispersive
nonlinearities leads to a new type of bistable device. We present this device and
determine  the  changes  of  the  nonlinear  refractive  index  and  of  the  absorption  with 
respect  to  the  incident  light  power. 


1.  Introduction 

We have presented earlier the bistable switching of an integrated nonlinear Fabry-Perot
device (NLFP) with a low threshold (~ 1 mW) and high contrast [1]. The structure, an optical
nonlinear spacer of approximately 2 µm length of intrinsic GaAs sandwiched between two
AlGaAs dielectric mirrors, was designed for operation at 885 nm. Since the stopband of the
dielectric mirrors is broad (115 nm for the back mirror), the NLFP can also be operated over a
broad spectrum of wavelengths, even well below the wavelength of the bandgap
(λ_gap ≈ 871 nm).


2.  Optical  bistability  above  the  gap  energy 

Below λ_gap, the Fabry-Perot resonance has a very low finesse at low incident power, since the
back mirror is masked by the highly absorbing cavity. At higher incident power, a decrease in
the absorption is induced, increasing the finesse, and first a positive then a negative change of the
refractive index shifts the resonance spectrally [2]. Figure 1a reproduces the changes in the
reflectivity spectra due to the increase of the incident power. Optical bistability is observed at
wavelength λ = 853 nm for 55 and 61 mW incident power (jump from high to low reflectivity
and a quasi-infinite differential reflectivity). The finesse of the resonance increases from less
than 2 at low power to more than 9 at 61 mW. The same behavior for the finesse was found in
simulation (plane wave model, absorption and refractive index changes interpolated from
experimental data [2]).

Figure 1b shows the bistable switching behavior above the gap (λ = 853 nm) at high
incident power. A 55 mW power beam, to which a short pulse (100 ns) of 12 mW is added,
induces latched switching which lasts for more than 300 ns. The switch-off is caused by the
positive refractive index change induced by heating. The measured contrast at switch-on is
3.5:1. Switch-on time is detector limited (10-20 ns).


Fig. 1 a) Spectral reflectivity measurements of a Fabry-Perot resonance, for incident
powers from 60 µW to 61 mW on a 6 µm spotsize. The change in the resonance
wavelength is sketched at the left of the figure.

b) Bistable (latched) switching above the gap energy. Top: power of input and
output beams vs. time. Bottom: output power vs. input power.


3.  Absorption  and  refractive  index  change  investigated  above  the  energy  of  the  gap 


We deduced the absorption and refractive index changes from nonlinear spectral
reflectivity measurements (SRMs) (Fig. 2). The optical constants vary with the
incident power of the laser beam (0 to 15 mW) and with the operating
wavelength (between 847 and 867 nm in our measurements). By fitting the formula for the
reflectance of the Fabry-Perot with absorption to the SRMs, we determined the minimum of
the low finesse resonances and their corresponding absorption. The nonlinear refractive index
change is deduced from the wavelength shift of the resonance [3]. The internal light intensity I_int
in the low finesse cavity is determined from the incident intensity I_i, the reflectivity of the
device R, and the absorption α following the relation

$$I_{\mathrm{int}} = I_i \, \frac{1 - R}{\alpha L}$$

where L is the length of the spacer. This relation is obtained from the equation for the reflectivity
of a Fabry-Perot including the spatial averaging of the intensity in the cavity [4].
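The relation between incident and internal intensity can be evaluated as follows. The printed equation is damaged in the source, so this sketch assumes the form I_int = I_i (1 - R)/(αL), i.e. the light that is not reflected is absorbed over the spacer length; the reflectivity and incident intensity in the example are hypothetical, while the absorption and spacer length are of the order quoted in the paper.

```python
def internal_intensity(i_incident, reflectivity, alpha_per_cm, length_cm):
    """Spatially averaged internal intensity, assuming I_int = I_i * (1 - R) / (alpha * L).

    The result has the same units as i_incident.
    """
    return i_incident * (1.0 - reflectivity) / (alpha_per_cm * length_cm)


# Hypothetical example: I_i = 10 kW/cm^2 and R = 0.3 are assumed values;
# alpha = 7e3 cm^-1 and L = 2 um (2e-4 cm) are of the order given in the text.
i_int = internal_intensity(10.0, 0.3, 7.0e3, 2.0e-4)
```

With αL = 1.4 the internal intensity in this example is half the incident value; stronger absorption or a longer spacer lowers it further.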


Fig. 2 Changes in absorption (top, in 1/µm) and in the refractive index (bottom) with
respect to the incident power on a 12 µm spotsize. The points are
determined from nonlinear SRMs at photon energies above the gap on the
sample #337. The number of points indicated for λ = 849.7 nm is the
number of SRMs used to get the curve.

Phenomenological coefficients α_2 and n_2 are deduced, characterizing the linear
absorption and refractive index change at different wavelengths with respect to the internal
intensity. The values of the phenomenological coefficients are α_2 = 322 cm/kW (λ = 853 nm)
and n_2 ≈ 2·10^-3 cm^2/kW (λ between 864 and 867 nm, incident power > 2.5 mW). n_2 is 30
times higher than the phenomenological Kerr coefficient n_2 = -7·10^-5 cm^2/kW determined at a
photon energy below the gap. This is related to the 30 times higher absorption of 7·10^3 cm^-1
measured above the gap in comparison to α = 250 cm^-1 below the gap [4].
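The "30 times" statements can be checked against the quoted coefficients. The exponents are damaged in the source, so the values below are a reading of them (n_2 of about 2·10^-3 and -7·10^-5 cm^2/kW, absorption of 7·10^3 and 250 cm^-1) and should be treated as assumptions; only magnitudes are compared.

```python
# Values as read from the text; exponents are assumptions (see lead-in).
n2_above_gap = 2e-3      # cm^2/kW, above the gap
n2_below_gap = -7e-5     # cm^2/kW, Kerr coefficient below the gap
alpha_above_gap = 7e3    # cm^-1, above the gap
alpha_below_gap = 250.0  # cm^-1, below the gap

n2_ratio = abs(n2_above_gap / n2_below_gap)        # roughly 29
alpha_ratio = alpha_above_gap / alpha_below_gap    # 28
```

Both ratios come out close to 30, which supports the statement that the larger nonlinear index change tracks the larger absorption.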


4.  Discussion 

The  above  gap  bistability  requires  higher  powers  (61  mW)  than  the  below  gap  bistability 
(1  mW)  due  to  the  requirements  that  the  absorption  be  bleached  and  that  the  dispersive 
refractive  index  change  be  sufficiently  negative.  However,  there  may  be  an  application  for 
such  a  type  of  bistability  in  cascaded  systems  where  an  "upconverter"  etalon  is  required  [5]. 
Two-wavelength  logic-gate  etalons  are  operated  with  a  signal  wavelength  which  is  longer 
than  the  pump  (bias)  wavelength.  For  example,  the  signal  beam  can  be  on-resonance  and  be 
more  efficiently  absorbed  than  the  pump  wavelength  which  is  detuned  from  resonance 
towards  shorter  wavelengths.  The  upconverter  would  work  in  the  opposite  sense  with  signal 
wavelengths  shorter  than  the  pump  wavelength  and  thus  allow  cascadability.  We  have  here 
shown  (as  is  predicted  from  the  curves  in  [2])  that  it  is  possible  to  bleach  the  absorption  above 
gap over a range of wavelengths. Therefore, it is possible to choose the pump wavelength
detuned  from  resonance  as  before  and  then  position  the  signal  wavelength  shorter  than  this.  In 
order  that  the  pump  wavelength  of  the  upconverter  is  the  same  as  the  signal  wavelength  of  the 
logic-gate  etalon,  the  bandgaps  of  the  material  chosen  in  the  two  cases  would  have  to  be 
different  (as  suggested  in  [5]). 

† Present affiliation of C. Bagnoud is the Dept. of Electrical and Computer Engineering,
UCSD, La Jolla, California 92093-0407, USA.

two mutually complementary outputs


Masanobu  Watanabe,  Seiji  Mukai,  and  Hiroyoshi  Yajima 

Optical  Information  Section,  Electrotechnical  Laboratory, 
Umezono,  Tsukuba,  Ibaraki,  305  Japan 


Abstract 

Theory and experiment on crosscoupled-mode bistability in a twin-stripe laser are reported.
The laser has two output ports complementary to each other, which is analogous to a
set-reset flip-flop in electronics. The limitation on the cavity length is also discussed.


1.  Introduction 

A crosscoupled field which diagonally crosses from one stripe to the other in a twin-stripe laser
was first experimentally observed by White and Carroll in 1983 [1], and the mechanism for its
generation was theoretically explained by Watanabe et al. in 1990 [2]. The theory showed that
such an asymmetric field, with its pattern at one facet being the mirror image of that at the other
facet, can be obtained with a completely symmetric structure and symmetric current injection.
Consequently, the laser should show bistability between two crosscoupled modes. The optical
power distribution of the bistable states is similar to the voltage distribution of a set-reset
flip-flop in electronics [3], and hence is useful for optical switching and logic operation.

Here, theory and experiment on crosscoupled-mode bistability in a twin-stripe laser are
reported. Switching indicating bistability between two crosscoupled modes was observed.


Fig. 1 Process to the crosscoupled mode operation: (a) fluctuation leads to a slightly
asymmetric pattern at a facet; (b) after one-way propagation; (c) final stable state;
(d) set-reset flip-flop.
2. Theory


Fig. 1 illustrates top views of a twin-stripe laser to show how a crosscoupled mode [1] builds
up [2-4]. Assume that the light power and the carriers are mainly in each of the two
waveguides under the stripes, the current is uniformly injected into the stripes, and the cavity
length is near the coupling length of the twin waveguide. The explanation starts with a
symmetric light pattern at both facets. Suppose a fluctuation leads to a slightly
asymmetric pattern near a facet (z=0) as shown in Fig. 1(a). This field pattern changes during
one-way propagation to the other facet (z=L), where it becomes nearly the mirror image (left
and right are reversed) of the light pattern at z=0, as shown in Fig. 1(b). During this
propagation, this light consumes more carriers at the right-lower and left-upper regions of the
twin waveguide than at the other regions. We should now examine how the resultant diagonal
carrier distribution reacts on the light field in the next step.

Intuitively,  one  may  expect  that  the  carrier  distribution  will  enhance  the  light  power  at 
the  regions  with  high  carrier  density,  which  makes  the  light  pattern  return  to  the  symmetric 
shape  because  of  negative  feedback.  On  the  contrary,  however,  it  was  theoretically  shown  that 
the  diagonal  carrier  distribution  enhances  the  light  power  at  the  regions  with  low  carrier 
density  and  makes  the  original  light  pattern  more  asymmetric,  if  the  cavity  length  is  shorter 
(longer)  than  the  coupling  length  for  twin-stripe  lasers  with  low  (high)  inter-stripe  gain  [2-4]. 
This  happens  because  of  both  lateral  and  longitudinal  resonance,  and  the  light  confinement 
difference  between  the  resonant  lateral  modes. 

An example of the light power distribution of resonant lateral modes is shown in Fig. 2,
which was calculated for the diagonal carrier distribution shown in Fig. 1(b). Here, the
interstripe gain was set to zero, the cavity length L was set to 95 % of the coupling length,
and the carrier density difference between the two regions was set to 2.4 % of the average
carrier density. In this example, the mode in Fig. 2(a) has the higher gain than the other
because of its higher light confinement, even though its highest peak is on the low carrier
density side. Therefore, the diagonal carrier distribution supports the mode with its highest
peak on the low carrier density side, which leads to positive feedback.

The light and the carrier distributions enhance the asymmetries of each other due to the
positive feedback and finally, both have substantially asymmetric patterns as illustrated in
Fig. 1(c). If the original light fluctuation is such that the left peak (instead of the right
peak as shown in Fig. 1(a)) becomes larger at z=0, then another stable state with mirror
images of the light and the carrier distributions shown in Fig. 1(c) is obtained.

Thus, there are two stable states, drawn in Fig. 1(d) in solid and broken curves,
respectively. The laser can be switched from one crosscoupled state to the other by light
injection. The operation is similar to a set-reset flip-flop in electronics, particularly in
that the laser has two mutually complementary outputs, while bistable semiconductor lasers
reported so far usually have only one output and hence require an additional inverter to get
the complementary output. Thus, twin-stripe lasers in crosscoupled-mode operation should be
useful for optical switching and logic operation.
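
The flip-flop analogy can be made concrete with a toy state model. The sketch below is purely illustrative: the class name `TwinStripeLatch`, its `inject` method and the output ordering are hypothetical, and it models only the logical behaviour (two complementary outputs, switched by light injection, held otherwise), not the laser physics.

```python
from typing import Tuple


class TwinStripeLatch:
    """Toy set-reset latch with two mutually complementary outputs."""

    def __init__(self, state: bool = False) -> None:
        # True stands for one crosscoupled mode, False for its mirror image.
        self.state = state

    def inject(self, set_pulse: bool = False, reset_pulse: bool = False) -> None:
        # Simultaneous set and reset is rejected, as in an electronic SR latch.
        if set_pulse and reset_pulse:
            raise ValueError("set and reset must not be applied together")
        if set_pulse:
            self.state = True
        elif reset_pulse:
            self.state = False
        # With no pulse the latch keeps its previous state (bistability).

    @property
    def outputs(self) -> Tuple[bool, bool]:
        # The two ports are always complementary; no external inverter is needed.
        return self.state, not self.state


latch = TwinStripeLatch()
latch.inject(set_pulse=True)
print(latch.outputs)
latch.inject()                  # no injection: the state is held
print(latch.outputs)
latch.inject(reset_pulse=True)
print(latch.outputs)
```

The point of the model is the `outputs` property: both ports are available at once, which is what distinguishes this device from a single-output bistable laser followed by an inverter.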


Fig. 2 Power distribution of resonant lateral modes (horizontal axis: x in µm)
3. Experiment

Fig. 3 shows the cross section of the lasers made for measurement, which have a structure
similar to that in [5]. Each has an ordinary double heterostructure with a 0.1 µm thick GaAs
active layer sandwiched by two Al0.35Ga0.65As cladding layers. Two 2-µm-wide stripe anodes
with 4 µm spacing were formed on 4-µm-wide mesas made by wet etching. The wafer was cleaved
to make lasers with a typical cavity length of 500 µm, which is estimated to be near the
coupling length.

Fig. 4(a) and (b) show near-field patterns measured with the left and right currents fixed
at 90 mA and 95 mA, respectively [6]. They show the two crosscoupled modes whose patterns
are mirror images of each other. When we measured the field patterns with pulse drive,
Fig. 4(a) was obtained at some pulses and (b) was obtained at the other pulses. This clear
switching with nominally identical pulses indicates bistability between two crosscoupled
modes.


Fig. 3 Cross section of the measured lasers (labels: layer thickness in µm; n-GaAs substrate)


Fig. 4 Measured near-field patterns, panels (a) and (b)


4. Cavity length limitation

The cavity length should be near the coupling length of the twin waveguide to give a highly
asymmetric light pattern. To reduce the laser length for crosscoupled mode generation, the
coupling between the two waveguides should become stronger. The limit of strongest coupling
is that the two waveguides are combined into one. Therefore, the shortest possible laser
length can be estimated from the propagation constant difference between the fundamental and
the first-order lateral modes of a single waveguide.

For a simple slab waveguide, the larger the permittivity difference between the core and
the clad, the shorter the minimum coupling length and the thinner the required core
thickness. For a GaAs core and Al0.35Ga0.65As cladding, whose permittivities are 13.1 and
11.4, respectively, the minimum coupling length is 2.7 µm while the required core thickness
is 0.2 µm. Here, the permittivity reduction of around 0.1 in the active layer was considered,
assuming a carrier density of 2×10¹⁸/cm³. In this limiting condition, however, the
crosscoupled mode does not have a clear asymmetric pattern, as explained below.
To see how far we should stay away from the limit condition, a simple calculation was
made to examine the supermode patterns and the coupling length for various coupling
strengths. Fig. 5 shows some of the results for simple GaAs - Al0.35Ga0.65As twin slab
waveguides where the core thickness was fixed to 0.1 µm (0.2 µm when they are combined).
The coupling was changed by the waveguide separation. For large separation (>0.4 µm), the
crosscoupled mode can have a clear asymmetric pattern with its power predominantly at one
side, because the two supermodes which compose the crosscoupled mode have similar power
distributions and can cancel each other well at the other side. At the limit condition
with zero waveguide separation, the crosscoupled mode does not have a clear asymmetric
pattern, because the first-order mode is at the transition point to cut-off and its power spreads
outside much more than that of the fundamental mode. Consequently, a cavity length of around
10 µm is required for clear spatial switching due to crosscoupled mode operation in a GaAs -
Al0.35Ga0.65As waveguide.


Fig. 5 Power distribution of the fundamental and the first-order supermodes for various waveguide
separations (horizontal axis: position in µm). The permittivity distribution is also shown at
the top of each figure. The separation, d, and the resultant coupling length, Lc, are:
(a) d = 0.8 µm, Lc = 86 µm; (b) d = 0.6 µm, Lc = 39 µm; (c) d = 0.4 µm, Lc = 17 µm;
(d) d = 0.2 µm, Lc = 7 µm.
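
The four (d, Lc) pairs listed for Fig. 5 fall close to a straight line on a semi-logarithmic scale, which is what one would expect if the coupling is set by the overlap of evanescent tails. The following sketch fits that line by least squares and also converts each coupling length into a supermode propagation-constant difference. It is an illustration only: the exponential form and the relation Lc = π/Δβ are assumptions of this sketch (the usual coupled-mode convention), and the function names are hypothetical.

```python
import math

# Separation d (um) and coupling length Lc (um) as listed for Fig. 5.
d_um = [0.8, 0.6, 0.4, 0.2]
lc_um = [86.0, 39.0, 17.0, 7.0]

# Least-squares straight line through (d, ln Lc): ln Lc = a + b * d.
n = len(d_um)
y = [math.log(v) for v in lc_um]
d_mean = sum(d_um) / n
y_mean = sum(y) / n
b = sum((d - d_mean) * (yi - y_mean) for d, yi in zip(d_um, y)) / sum(
    (d - d_mean) ** 2 for d in d_um
)
a = y_mean - b * d_mean


def coupling_length_um(separation_um: float) -> float:
    """Coupling length (um) interpolated from the fitted exponential."""
    return math.exp(a + b * separation_um)


def delta_beta_per_um(lc: float) -> float:
    """Supermode propagation-constant difference, assuming Lc = pi / delta_beta."""
    return math.pi / lc


for d, lc in zip(d_um, lc_um):
    print(f"d = {d:.1f} um: Lc = {lc:5.1f} um, fit = {coupling_length_um(d):5.1f} um, "
          f"delta_beta = {delta_beta_per_um(lc):.3f} rad/um")
```

With these four points the fitted curve stays within a few percent of the listed values, so it can serve as a rough interpolation between the separations that were actually calculated.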

5. Summary

The mechanism for the crosscoupled mode generation was theoretically explained, and two
crosscoupled modes were observed experimentally with nominally identical current pulses.
This indicates bistability, which is useful for set-reset flip-flop operation with two mutually
complementary outputs. The cavity length required to obtain clear spatial switching was
estimated to be around 10 µm for a GaAs-Al0.35Ga0.65As twin slab waveguide.
Abstract. We report a novel optical bistable device, which consists of a double-barrier
resonant-tunneling diode and a triangular-barrier phototransistor. Clear negative differential
resistance and optical bistability with high contrast and high sensitivity are demonstrated.


1. Introduction


An optical bistable device is a key element for optical computing, and many devices have been
studied so far for this purpose. As a photodetecting element in these devices, several kinds of
structures such as a conventional heterostructure phototransistor (HPT) have been studied [1].
As another candidate, a triangular-barrier phototransistor (TBP) with a very thin gate layer,
which has been developed in GaAs / AlGaAs [2], is very promising from the viewpoint of high
speed and high sensitivity. On the other hand, a resonant-tunneling diode (RTD) has the
potential to give rise to unique functions attributed to its N-shaped negative differential
resistance (NDR) and high speed operation [3]. For these reasons, the integration of a
double-barrier RTD (DB-RTD) and a TBP seems very interesting for making a novel optical
bistable device.

In addition, operation of these devices in the 1 µm wavelength range is desirable for
matching to optical communication. In order to fabricate the devices with InP-related
materials for this purpose, gas source molecular beam epitaxy (GSMBE) should be a
promising technique [4], because sharp interfaces and a good δ-doped layer, which are
essential for the device, are attained.

In this paper, we propose the Resonant-tunneling Triangular-barrier Optoelectronic Switch
(R-TOPS) grown by GSMBE, and the fabrication and the fundamental experimental results
for operation in the 1 µm wavelength range are reported.


2. Device structure

Figure 1 shows a band diagram of the device fabricated. It was composed of InGaAs /
(In)AlAs for 1.5 µm wavelength range operation. The TBP had an n+-i-δp+-i-n+ structure, and an
i-InAlAs region was designated as a source, a δp+-InGaAs region as a gate, and an i-InGaAs
region as a drain. When an input-light is incident to the device with the drain positively
biased and is absorbed in the drain, the generated holes move to the gate layer and lower the
potential barrier of the gate. This makes majority carriers, electrons in this case, flow over the
potential barrier from the source to the drain. The majority carrier flow is due to thermionic
emission because of the ultrathin gate layer, so the current versus voltage characteristic shows
an exponential-like curve, and high optical sensitivity and high speed operation are expected
[2]. As for the DB-RTD, the well layer and barrier layers were i-InGaAs and i-AlAs,
respectively, with i-InGaAs spacer layers outside the barrier layers. It was reported that
this structure shows a large peak-to-valley current ratio at room temperature because of the
very large barrier height [5]. The TBP and the DB-RTD were integrated with an n+-InP etch
stop layer in between. Since the DB-RTD has N-shaped NDR characteristics, and the TBP
has an exponential-like dependence of current upon voltage that varies with input-light power,
it is predicted that the R-TOPS shows a new class of bistability. The output-current change is
expected to be large because the TBP has a small differential resistance. Figure 2 shows a
schematic load-line graph for the R-TOPS. The thick curve is the characteristic of the DB-RTD,
and the three thin curves are those of the TBP at different input-light powers P1, P2 and P3,
which show an exponential-like dependence of current upon voltage due to thermionic emission.
By changing the input-light power, the intersections vary as shown in Fig. 2.

Fig. 1. Band diagram (DB-RTD and TBP sections, with InGaAs, InAlAs, AlAs and n+-InP layer labels)

Fig. 2. Schematic load-line graph for R-TOPS
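
The load-line picture of Fig. 2 can be reproduced numerically with toy device curves. The sketch below is not the authors' model: the piecewise-linear N-shaped RTD curve, the exponential TBP curve, the scale factor `c` (standing in for the input-light power), the 0.6 V exponential slope and the 2.0 V bias are all hypothetical values chosen only to show how the number of intersections changes with light power.

```python
import math


def rtd_current_ma(v: float) -> float:
    """Toy piecewise-linear N-shaped RTD curve (mA); breakpoints are hypothetical."""
    if v <= 0.5:
        return 20.0 * v                    # rise to a 10 mA peak at 0.5 V
    if v <= 1.0:
        return 10.0 - 14.0 * (v - 0.5)     # NDR region down to a 3 mA valley at 1.0 V
    return 3.0 + 20.0 * (v - 1.0)          # second rise


def tbp_current_ma(u: float, c: float, v0: float = 0.6) -> float:
    """Toy exponential-like TBP curve (mA); c grows with input-light power."""
    return c * (math.exp(u / v0) - 1.0)


def count_operating_points(c: float, v_bias: float = 2.0, steps: int = 4000) -> int:
    """Count intersections of the RTD curve with the TBP load curve.

    The two devices are in series, so they carry the same current and the
    TBP sees v_bias minus the RTD voltage.
    """
    crossings = 0
    previous = None
    for i in range(steps + 1):
        v = v_bias * i / steps
        diff = rtd_current_ma(v) - tbp_current_ma(v_bias - v, c)
        if previous is not None and (previous < 0.0) != (diff < 0.0):
            crossings += 1
        previous = diff
    return crossings


for c in (0.3, 0.8, 2.0):
    print(f"light-dependent scale c = {c}: {count_operating_points(c)} intersection(s)")
```

In this toy model a single intersection means one operating point, while three intersections mark the bistable range: the outer two are the stable states and the middle one, lying on the NDR branch, is unstable. Sweeping `c` up and down through that range is what produces the hysteresis.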

The device was grown on a (100) n+-InP substrate by GSMBE, in which 100% AsH3
and PH3 were used for group V sources, and silicon and beryllium were used as n-type and
p-type dopants, respectively. The growth temperature was 530°C for InGaAs and (In)AlAs.
The device structure consisted of n+-In0.53Ga0.47As (0.098 µm, 2.3×10¹⁸ cm⁻³),
i-In0.53Ga0.47As (0.79 µm), δp+-In0.53Ga0.47As (59 Å, 3.7×10¹⁸ cm⁻³), i-In0.52Al0.48As
(0.041 µm), n+-In0.52Al0.48As (0.10 µm, 2.2×10¹⁸ cm⁻³), n+-In0.53Ga0.47As (0.034 µm,
2.3×10¹⁸ cm⁻³), n+-InP (56 Å, 4.2×10¹⁸ cm⁻³), n+-In0.53Ga0.47As (0.099 µm, 2.3×10¹⁸ cm⁻³),
i-In0.53Ga0.47As (12 Å), i-AlAs (30 Å), i-In0.53Ga0.47As (32 Å), i-AlAs (30 Å),
i-In0.53Ga0.47As (12 Å) and n+-In0.53Ga0.47As (0.099 µm, 2.3×10¹⁸ cm⁻³). The device was
etched into a mesa 60 µm in diameter. Contact to the top layer was formed by evaporating
Au / Sn in the shape of an open ring structure using a lift-off technique with SiNx
passivation. Input-light was illuminated onto the top of the device with a lensed fiber.


Fig. 3. I-V characteristics at different input-light powers (axes: current in mA versus
voltage Vb in V, 0 to 3.0 V)

Fig. 4. Output-current vs. input-light power at different bias voltages (horizontal axis:
input-light power, 0 to 60 µW)

3. Characteristics

Experimental I-V characteristics at different input-light powers at room temperature are
shown in Fig. 3. The substrate side was normally positively biased at Vb, and the wavelength
of the input-light was 1.55 µm. Clear N-shaped NDR with bistability ('electronic bistability') was
observed, and the peak-to-valley current ratio and the peak current density were 3.6 and 910
A/cm², respectively. The peak and the valley currents were determined by those of the RTD,
so these were almost independent of the input-light power. On the other hand, as the
input-light power increased, the peaks moved to the lower voltage side, which was due to the
decrease of the series resistance of the TBP. At the same time, the hysteresis widths also
varied. This was attributed to the slope characteristics of the TBP depending on the
input-light power. These properties show that both the TBP and the DB-RTD worked well as
expected, even when they were integrated. Bistability in the relation between input-light
power and output-current ('optoelectronic bistability') at room temperature was obtained as
shown in Fig. 4. The device operated at an input-light power as small as 50 µW or even
smaller, and the output-current change was about 10 mA, which was large enough to drive other
optical devices such as a laser diode. Since the hysteresis width was about 30 µW, switching
could be attained by an input-light swing of no more than 20 µW with an appropriate bias
light. We also realized 'optical bistable' operation by connecting the R-TOPS and a laser
diode as shown in Fig. 5. A high contrast ratio of about 13 dB and an intrinsic optical gain
of about 6 dB from an input fiber to an output fiber were obtained. These characteristics
show the possibility of optical XOR logic and memory operation. In practice, the dynamic
property was measured as shown in Fig. 6. The upper trace is the input-light, and the lower
trace is the output-current. The input-light was modulated by both set and reset pulses with
a bias input-light. The repetition rate was 500 kHz, and the pulse width was 150 ns. Due to
the utilization of N-shaped NDR, the set operation was done by a decrease of input-light
power, and the reset by an increase. Good dynamic memory operation was obtained with a
switching energy as small as 4.5 pJ, corresponding to 1.6 fJ/µm². It is possible to improve
the switching speed of this device by optimizing the device parameters and the measurement
circuit.
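
The quoted switching energy and energy density can be cross-checked against the other numbers in this section. The text does not say how the 4.5 pJ was obtained; the sketch below assumes it is the hysteresis width (read here as 30 µW) multiplied by the 150 ns pulse width, and that the density refers to the area of the 60 µm diameter mesa. Both readings are assumptions made for this check.

```python
import math

pulse_width_s = 150e-9     # pulse width quoted in the text
power_swing_w = 30e-6      # hysteresis width, read here as 30 uW (assumption)
mesa_diameter_um = 60.0    # mesa diameter from the device description

energy_j = power_swing_w * pulse_width_s
mesa_area_um2 = math.pi * (mesa_diameter_um / 2.0) ** 2
energy_density_fj_per_um2 = energy_j * 1e15 / mesa_area_um2

print(f"switching energy: {energy_j * 1e12:.2f} pJ")
print(f"energy density:   {energy_density_fj_per_um2:.2f} fJ/um^2")
```

Under these assumptions the product reproduces 4.5 pJ, and dividing by the mesa area gives a value that rounds to the quoted 1.6 fJ/µm².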


Fig. 5. Optical bistable characteristics at different bias voltages (vertical axis:
output-light power in µW)

Figure 6. Dynamic memory operation

4. Conclusion

In conclusion, we fabricated a novel R-TOPS by GSMBE, and clear N-shaped NDR and
bistability were obtained. Dynamic memory operation was also demonstrated. By
introducing light emitting or modulating elements, we can realize an all-optical bistable device.
The R-TOPS is expected to play an important role in optical computing.
Abstract. We report a novel optical functional device, the Triangular-barrier Optoelectronic
Switch (TOPS), which consists of a triangular-barrier phototransistor with avalanche
multiplication. Clear differential gain, bistability and latch characteristics with high
sensitivity and high gain were obtained by only changing the bias voltages.


1. Introduction


In order to realize optical computing, various kinds of functions, such as logic and memory,
are needed for optical devices. For these purposes, an optoelectronic switch
showing nonlinear optical response is quite promising, and several kinds of devices have been
studied so far. Among those devices, the p-n-p-n double-heterostructure optoelectronic switch
(DOES) exhibited optically controlled switching [1,2]. The heterojunction bipolar
phototransistor (HPT), which consisted of an n+-p+-i-n+ structure, also showed similar
characteristics [3]. These devices utilized avalanche breakdown in the reverse-biased region
for switching. Once the breakdown occurred, it was hardly possible to reset the device to the
off-state optically unless a relatively high impedance resistor of about 1 kΩ was connected
externally, which was a disadvantage for high speed operation [2]. This was because the
on-state I-V characteristics of these devices were almost independent of input-light due to
inherent properties of p-n junctions. If some structure whose I-V characteristics depend on
input-light power can be utilized, switching may be fully controlled by light; in other words,
optical reset is also expected. One of the possible candidates for such a structure is the
n+-i-δp+-i-n+ structure.

This structure, as a triangular-barrier phototransistor (TBP) with a very thin gate layer,
has been developed in GaAs / AlGaAs, and is very promising from the viewpoint of high speed
and high sensitivity [4]. However, only conventional functions as a phototransistor were reported.
In order to apply a TBP to an optoelectronic switch, it is important to introduce avalanche
multiplication into the TBP. In this paper we propose the Triangular-barrier Optoelectronic
Switch (TOPS) for 1 µm wavelength range operation. It consisted of InGaAs / InAlAs grown by gas
source molecular beam epitaxy (GSMBE), which is a promising technique for sharp
interfaces and a good δ-doped layer [5]. We successfully demonstrated flexible optical
functions of the TOPS associated with avalanche multiplication. To our knowledge, this is the
first report on an optically controllable optoelectronic switch using a TBP.


2. Device structure

Figure 1 shows a band diagram of the TOPS under bias. It had an n+-i-δp+-i-n+ structure, and was
composed of InGaAs / InAlAs for 1.5 µm wavelength range operation. The device structure
consisted of n+-In0.53Ga0.47As (0.10 µm, 2.0×10¹⁸ cm⁻³), i-In0.53Ga0.47As (0.84 µm),
δp+-In0.53Ga0.47As (82 Å, 1.2×10¹⁸ cm⁻³), i-In0.52Al0.48As (0.041 µm), n+-In0.52Al0.48As
(0.10 µm, 2.3×10¹⁸ cm⁻³) and an n+-In0.53Ga0.47As cap layer (0.034 µm, 2.0×10¹⁸ cm⁻³). An
i-InAlAs region, a δp+-InGaAs region and an i-InGaAs region were designated as a source, a
gate and a drain, respectively. The device was grown on a (100) n+-InP substrate by GSMBE,
in which 100% AsH3 was used for a group V source, and silicon and beryllium were used as
n-type and p-type dopants, respectively. The growth temperature was 500°C, and the typical
growth rate was 0.4 µm/h. The device was formed into a mesa structure 60 µm in diameter.
Contact to the top layer was achieved by evaporating Au / Sn to form an open ring structure
using a lift-off technique with SiNx passivation. An input-light was introduced to the top of
the device with a lensed fiber, and the wavelength of the input-light was 1.55 µm.

If the input-light is incident to the device when the drain is biased positively, electrons
and holes are generated in the drain, and the holes move to the gate layer and accumulate in it.
The resultant lowering of the potential barrier of the gate increases the majority carrier flow of
electrons over the potential barrier from the source to the drain region due to thermionic
emission, so the current vs. voltage characteristics show exponential-like curves.
Avalanche multiplication can occur in the drain when the device structure is optimized and the
electric field in the drain is large enough. Multiplied holes further lower the potential barrier,
so the majority carriers flow even more. This positive feedback between the hole
generation and the electron flow through the avalanche multiplication gives rise to
optoelectronic switching. Since the output current of the TBP is strongly dependent on the
input-light power, we can control the switching by changing the input-light. It should be noted
that we can stop the avalanche multiplication by decreasing the input-light power. Various
functions can be chosen by only changing the bias voltage.


Fig. 1. Band diagram with a bias voltage.

Fig. 2. I-V characteristics at different input-light powers (λ = 1.55 µm)
3. Characteristics


Experimental I-V characteristics at different input-light powers observed at room temperature
are shown in Fig. 2. Ordinary I-V characteristics of a TBP were obtained when the input-light
power was more than 2 µW. However, a significant S-shaped negative differential resistance
(NDR) was observed when the input-light power was less than 2 µW. These NDR characteristics
had almost the same traces for the cases when the input-light power increased and decreased.

The input-light power versus output-current characteristics obtained by changing the bias
voltage at room temperature are shown in Fig. 3 (a)-(c). We obtained three types of
functional operation by just changing the bias voltage without any additional external
resistance. At a bias voltage of 2.30 V, a differential gain characteristic was observed as shown
in Fig. 3 (a). It is noted that the step was very sharp at an input-light power as small as
600 nW, and the output current was about 6 mA. The corresponding optical gain was estimated
to be 7400. We can apply this mode for AND or OR logic. When the bias voltage was increased
to 2.35 V, the response changed to a bistable characteristic as shown in Fig. 3 (b). The turn-on
input-light power decreased to 450 nW, so the estimated optical gain was 12000. The
hysteresis width varied depending on the bias voltage, so this mode is good for memory
operation. At a larger bias voltage of 2.40 V, a latch characteristic was attained as shown in
Fig. 3 (c). The optical gain in this case was as large as 17000 at a turn-on input-light
power of 350 nW, so the device can work as a switch. We can reset the holding state by
decreasing the bias voltage in this latch mode.

Fig. 3. Input-light power vs. output-current characteristics by changing bias voltages Vb
(panel (a): differential gain characteristic)

Fig. 4. Dynamic memory operation by bistable characteristic
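
The quoted optical gains can be related to the turn-on powers and output currents with a short calculation. The text does not define the optical gain; the sketch below assumes the common definition of collected electrons per incident photon, (I/q)/(P/hν), at the stated wavelength of 1.55 µm. The function names are hypothetical.

```python
PLANCK_J_S = 6.62607015e-34
LIGHT_SPEED_M_S = 2.99792458e8
ELEMENTARY_CHARGE_C = 1.602176634e-19


def optical_gain(current_a: float, power_w: float, wavelength_m: float = 1.55e-6) -> float:
    """Electrons collected per incident photon: (I / q) / (P / h*nu)."""
    photon_energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return (current_a / ELEMENTARY_CHARGE_C) / (power_w / photon_energy_j)


def output_current_a(gain: float, power_w: float, wavelength_m: float = 1.55e-6) -> float:
    """Invert the definition: current implied by a quoted gain and turn-on power."""
    photon_energy_j = PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m
    return gain * power_w * ELEMENTARY_CHARGE_C / photon_energy_j


print(f"gain for 6 mA at 600 nW: {optical_gain(6e-3, 600e-9):.0f}")
for gain, power in ((7400, 600e-9), (12000, 450e-9), (17000, 350e-9)):
    current_ma = output_current_a(gain, power) * 1e3
    print(f"gain {gain} at {power * 1e9:.0f} nW -> {current_ma:.2f} mA")
```

Under this definition an output current of exactly 6 mA at 600 nW would give a gain near 8000, so the quoted 7400 corresponds to a current slightly below 6 mA, in line with the "about 6 mA" in the text. The three quoted gains all imply output currents of a few milliamperes.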

Since the output current density of the TOPS was as large as 200 A/cm², we can realize
all-optical functional devices such as an optical logic device, an optical memory and so on by
integrating a light-emitting or a light-modulating device with it.

Preliminary dynamic memory operation based on the bistable characteristic was
obtained at a repetition rate of 100 kHz as shown in Fig. 4. The upper trace is the input-light,
and the lower trace is the output-current. The input-light was modulated by both set and reset
pulses with a bias input-light. It is possible to improve the switching speed of this device
by optimizing the device parameters and the measurement circuit.


4. Conclusion

We demonstrated a novel optical functional device, the Triangular-barrier Optoelectronic Switch
(TOPS), which consisted of a triangular-barrier phototransistor with avalanche multiplication.
Clear differential gain, bistability and latch characteristics were obtained by only changing
the bias voltage, while the input-light power was less than 600 nW and the optical gain was
more than 7000. We can apply the TOPS to optical functional devices for optical logic and
optical memory. The TOPS is expected to play an important role in optical computing.
Abstract. All-optical bistable devices were demonstrated by using the longitudinal mode
hopping of the laser diode and the narrow transmission spectrum of an interference filter.
Since a hysteresis characteristic exists in the relationship between the wavelength and the
injection current of the laser diode, optical bistability was observed in this system.
Optical switch-on and -off were confirmed by directly injecting a dye laser beam. In addition,
the optical characteristics of Er-doped GaAs and Er-doped silicate glass filters were
investigated.


1.  Introduction 

Optical bistability in laser diodes (LDs) is a most interesting subject because of its many
advantages; for example, such diodes have optical gain, can provide large fan-out and can
have sub-nanosecond operating times [1,2]. Bistable optical devices were demonstrated by
using the longitudinal mode hopping of the laser diode and the narrow absorption band of
erbium in an yttrium aluminum garnet crystal (Er:YAG), and referred to as ORION (Optical
logic devices using the Red shift of a laser diode and Inversion in Optical absorbers (or
filters) of a Narrow spectral bandwidth) [3]. However, when using an absorber such as an
Er:YAG crystal as reported earlier, no absorption wavelength can be utilized other than that
specific to the absorber. Furthermore, there was a disadvantage that crystals like YAG
are hardly compatible with semiconductors. This paper shows that the ORION device was
successfully demonstrated by using an artificial dielectric interference filter in place of a
natural absorber such as Er:YAG. By using the interference filters in this study, it
becomes possible to use an arbitrary wavelength and to form the filter on semiconductors
with ease using vacuum evaporation. The optical switch-on and -off phenomena were
observed by direct injection of an external dye laser beam into the LD. In addition,
natural absorption filters such as Er-doped GaAs and Er-doped silicate glass were
investigated.
2. Interference filters

The light source was a high-power AlGaAs laser diode operating in a single transverse
mode. The threshold current is 50 mA, the operating current at an optical output of 30 mW
is 100 mA and the slope efficiency is 0.6 mW/mA. Its oscillation wavelength was kept stable
by adjusting the temperature of the laser head. The dielectric interference filter was formed
by vacuum-evaporating 11 alternating layers of TiO2 (866 Å) and SiO2 (1335 Å) on a glass
substrate heated to 300 °C, then depositing 2670 Å of SiO2 as a spacer, and adding another
11 alternating layers of TiO2 and SiO2 with the same thicknesses. The refractive indices of
the TiO2 and SiO2 films were 2.25 and 1.46, respectively. Optical waveforms were recorded
with a digital oscilloscope with a bandwidth of 1 GHz and a photodetector with a rise time of
90 psec. An external dye laser excited by a nitrogen (N2) laser was injected through a
beam splitter into the LD. The dye laser produced a beam with a central wavelength
of 546 nm, a spectral width of 2 nm and a pulse width of 500 psec.
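
The layer thicknesses quoted above can be checked against the usual quarter-wave (mirror layers) and half-wave (spacer) conditions. The short Python sketch below is only an illustration: it assumes normal incidence and bulk-like, dispersion-free films, and the function names are ours rather than the authors'. It computes the wavelength at which each quoted layer has the stated optical thickness, which can be compared with the measured transmission centre of 786.8 nm.

```python
# Quarter-wave / half-wave check of the filter stack (illustrative sketch).
# Indices and thicknesses are the values quoted in the text; the assumption
# of normal incidence and non-dispersive films is ours.

N_TIO2, N_SIO2 = 2.25, 1.46           # refractive indices of the films
D_TIO2_A, D_SIO2_A = 866.0, 1335.0    # mirror-layer thicknesses, angstroms
D_SPACER_A = 2670.0                   # SiO2 spacer thickness, angstroms


def quarter_wave_nm(n, d_angstrom):
    """Wavelength (nm) at which a layer n*d equals a quarter wave."""
    return 4.0 * n * d_angstrom / 10.0


def half_wave_nm(n, d_angstrom):
    """Wavelength (nm) at which a layer n*d equals a half wave."""
    return 2.0 * n * d_angstrom / 10.0


lam_tio2 = quarter_wave_nm(N_TIO2, D_TIO2_A)
lam_sio2 = quarter_wave_nm(N_SIO2, D_SIO2_A)
lam_spacer = half_wave_nm(N_SIO2, D_SPACER_A)

print(f"TiO2 quarter-wave design wavelength: {lam_tio2:.1f} nm")
print(f"SiO2 quarter-wave design wavelength: {lam_sio2:.1f} nm")
print(f"SiO2 spacer half-wave wavelength:    {lam_spacer:.1f} nm")
```

All three values fall close to 780 nm, within about 1 % of the measured centre wavelength, which is consistent with a single-cavity filter built from two quarter-wave mirrors around a half-wave spacer.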

Figure 1 shows the transmission spectrum of the interference filter, which is centered at
786.8 nm and has a half-width of about 3 nm. Points A and B indicate the central wavelengths
of the A and B modes of the LD. Figure 2 shows the relationship, after transmission through
the interference filter, between the laser power and the injection current of the LD in
continuous operation, measured with an optical power meter. The bistable curve fell rapidly
at 82 mA as the injection current increased and, in contrast, rose rapidly at 78 mA as it
decreased. Figure 3(a) shows the optical intensity variation after transmission through
the filter when the pulsed dye laser of Fig. 3(b) was injected into the LD during continuous
oscillation at an injection current of 79 mA. Optical switch-on and switch-off are observed
when the trigger light pulse of Fig. 3(b) is applied. The rise and fall times are around
1 nsec and are limited by the bandwidth of the oscilloscope.

Fig. 1 Transmission spectrum of the interference filter (horizontal axis: wavelength in nm).
Points A and B indicate the central wavelengths of the A and B modes of the laser diode.

Fig. 2 Laser intensity after transmission through the filter versus injection current (mA)
of the laser diode, showing hysteresis.
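
The hysteresis described in this section can be summarized by a two-threshold state model. The Python sketch below is a deliberately idealized illustration, not the authors' model: it treats the transmitted power as simply "high" or "low", uses the 82 mA and 78 mA switching currents quoted in the text, and ignores the actual shape of the curve. It shows why an operating point of 79 mA lies inside the bistable window, so that the output state there depends on the history of the device.

```python
# Idealized two-threshold (Schmitt-trigger-like) picture of the hysteresis.
# Threshold currents are taken from the text; the binary high/low output
# is a simplifying assumption.

I_FALL_MA = 82.0   # transmitted power drops when the current rises to this value
I_RISE_MA = 78.0   # transmitted power recovers when the current falls to this value


def sweep(currents_ma, state_high):
    """Return the transmission state (True = high) after each current step."""
    states = []
    for current in currents_ma:
        if state_high and current >= I_FALL_MA:
            state_high = False
        elif not state_high and current <= I_RISE_MA:
            state_high = True
        states.append(state_high)
    return states


up = [70 + k for k in range(21)]        # 70, 71, ..., 90 mA
down = list(reversed(up))               # 90, 89, ..., 70 mA

up_states = sweep(up, state_high=True)
down_states = sweep(down, state_high=up_states[-1])

for label, currents, states in (("up", up, up_states), ("down", down, down_states)):
    at_79 = states[currents.index(79)]
    print(f"{label:>4} sweep, state at 79 mA: {'high' if at_79 else 'low'}")
```

In this picture the trigger light pulse plays the role of a brief excursion that moves the device from one branch to the other while the bias stays at 79 mA.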

3. Er-doped GaAs and silicate glass filters

In this section, the optical characteristics of Er-doped GaAs films and Er-doped silicate
glass films are investigated. Details of the sample preparation were reported in Refs [4, 5].
Er-doped GaAs was grown by metalorganic chemical vapor deposition. Typical photoluminescence
(PL) spectra of Er-doped GaAs epitaxial layers were obtained at 5.5, 77 and 300 K. At a
temperature of 5.5 K, the PL spectrum exhibits two distinct peaks at 1.54 and 1.55 μm. The
spectral bandwidth of the main peak at 1.54 μm is estimated to be about 10 nm. The emission
lines are related to the internal 4f-4f transition 4I13/2 → 4I15/2 of Er3+. Figure 4
shows a room-temperature PL spectrum of an Er-doped silica glass film deposited on a
Si substrate by RF magnetron sputtering. The spectrum peaks at 1.54 μm. Another peak
(~1.55 μm) is noted, as for the Er-doped GaAs films. The spectral bandwidth of the main
peak at 1.54 μm is estimated to be about 10 nm. These experimental results indicate that
Er-doped GaAs and Er-doped silicate glass have potential applications in optical filters
of narrow spectral bandwidth suitable for ORION devices.

4.  Switching  time  of  ORION  devices 

We now consider the switching time of the ORION device. As mentioned above, the switching
time of this device is the time needed to induce longitudinal mode hopping by injecting
carriers into the LD. It was confirmed experimentally, however, that the device responds to
a laser pulse with a width of 500 psec. The detailed switching mechanism still has to be
explored, in particular whether the switching action simply takes place during these 500 psec
or only after carrier accumulation following the 500 psec pulse. For this purpose the
switching time must also be measured accurately in future work, using a short-pulse laser
source and a high-speed photodetector.

Fig. 3 (a): Optical intensity variation after transmission through the interference filter
when the trigger light pulse is applied to the laser diode. (b): Trigger light pulse of the
dye laser. Time scale: 25 nsec/div.

Fig. 4 Room-temperature PL spectrum of an Er-doped silica film deposited on a silicon
substrate (horizontal axis: wavelength in nm). The film was deposited in an Ar/O2 (95/5)
ambient and annealed at 850 °C for 30 min.

In general, the longitudinal modes of laser diodes are altered by the temperature and the
injected carrier density in the oscillation region. In DC operation the injected carrier density
is constant above the threshold current, and thus the increase in refractive index
caused by the temperature rise dominates, shifting the series of longitudinal
modes slightly to longer wavelengths. If the current is increased further, the oscillation
hops from the central longitudinal mode to the mode with the highest intensity among the
longitudinal modes on the longer-wavelength side [6]. In high-speed
communication using pulses shorter than the damping time of the relaxation
oscillation, several longitudinal modes oscillate simultaneously. The device in this
work depends strongly on the oscillation wavelength of the laser diode, that is, on its
longitudinal modes. The change of the longitudinal modes due to relaxation
oscillation is therefore an important factor in determining the characteristics of the device.
In addition, longitudinal mode hopping has so far been treated as an undesirable event because
it causes noise, and LDs have accordingly been designed to suppress it for the convenience of
users. It is therefore expected that devices with excellent switching time and characteristics
could be fabricated if LDs optimized to make use of longitudinal mode hopping were
designed.

5.  Conclusions 

A bistable optical device was demonstrated by using the longitudinal mode hopping of an LD
and the narrow transmission spectral line of an interference filter. Since a hysteresis
characteristic exists in the relationship between the wavelength and the injection current of
the LD, optical bistability was observed in this system. In addition, the optical switch-on
and switch-off phenomena were observed by directly injecting a 500 psec pulse from an
external dye laser into the LD. Er-doped GaAs and Er-doped silicate glass have potential
applications in optical filters of narrow spectral bandwidth. It was confirmed that the device
can convert changes in the wavelength of the LD into changes in the intensity transmitted
through the filter, and it could be applied as an all-optical logic device.
Abstract. Carrier-heating-induced suppression of band filling is shown to be a fast and effective
mechanism of optical nonlinearity leading to all-optical bistability in degenerate
semiconductors. Regenerative pulsations in a bistable etalon, which are possible owing to
competition between the influences of carrier generation and heating on band filling, are also
discussed.


1.  Introduction. 

Photonic  switching  in  III-V  semiconductors  in  the  spectral  range  near  the  fundamental 
absorption  edge  attracts  much  attention  because  of  its  great  capability  for  optical  computing 
and  data  processing  [1].  The  major  characteristic  of  the  related  digital  systems  is  their  speed, 
which  is  mostly  determined  by  the  physical  mechanism  of  the  optical  nonlinearity  of  the  active 
medium. Recently, it has been shown that suppression of band filling by optically induced
carrier heating in a degenerate semiconductor leads to all-optical switching on a picosecond
time scale [2,3]. Herein we study this mechanism of nonlinearity and the related optical bistability
as applied to both bulk and multiple quantum well (MQW) GaInAs/InP structures, and also
discuss the possibility of self-pulsation in a bistable Fabry-Perot etalon.


2.  Mechanism  of  nonlinearity 

Consider a degenerate n-type semiconductor (or MQW structure) with thermalized
electrons whose Fermi quasilevel lies well above the conduction band (or ground
subband) bottom. For energy states below the Fermi quasilevel, the probability of direct
interband transitions is reduced by band filling, and so the fundamental absorption edge, which
is sharper the lower the carrier temperature, is effectively shifted towards higher photon
energies [4]. However, any increase in the electron effective temperature $T_e$ leads to greater
occupation of the high-energy states and so suppresses the band filling and smears out the
fundamental absorption edge. As a result, for the spectral range corresponding to direct
transitions below the Fermi quasilevel, the interband absorption coefficient increases with
effective temperature, and the related carrier contribution to the band-edge refractive
index is likewise influenced by $T_e$.
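
The trend described above can be illustrated with the Fermi-Dirac occupation alone. The Python sketch below is a rough illustration under stated assumptions, not the two-band calculation used in this paper: the Fermi quasilevel is held fixed at a hypothetical 100 meV above the band bottom while the temperature is varied, the final state of the transition is placed at a hypothetical 60 meV, and the absorption is taken to scale with the fraction of final states left empty, 1 - f.

```python
# Band filling versus electron temperature (illustrative sketch only).
# MU_MEV and E_FINAL_MEV are hypothetical values; keeping the quasilevel
# fixed while heating is a simplification made for this example.
import math

K_B_MEV_PER_K = 0.08617   # Boltzmann constant in meV/K
MU_MEV = 100.0            # hypothetical Fermi quasilevel above the band bottom
E_FINAL_MEV = 60.0        # hypothetical final-state energy below the quasilevel


def fermi(e_mev, mu_mev, t_kelvin):
    """Fermi-Dirac occupation of a state at energy e_mev."""
    return 1.0 / (1.0 + math.exp((e_mev - mu_mev) / (K_B_MEV_PER_K * t_kelvin)))


def empty_fraction(e_mev, mu_mev, t_kelvin):
    """Fraction of final states available for interband absorption (1 - f)."""
    return 1.0 - fermi(e_mev, mu_mev, t_kelvin)


temperatures = (300.0, 450.0, 600.0, 900.0)
fractions = [empty_fraction(E_FINAL_MEV, MU_MEV, t) for t in temperatures]

for t, frac in zip(temperatures, fractions):
    print(f"T_e = {t:5.0f} K   available final states: {frac:.3f}")
```

The available fraction grows monotonically with temperature, which is the qualitative origin of the absorption increase discussed in the text.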
Fig. 1 The influence of the effective electron temperature on the interband absorption coefficient (a) and
change in propagation constant (b) spectra of bulk (1) and MQW Ga0.47In0.53As structures with well width
a = 80 Å (2), 65 Å (3) and period d = 300 Å. Carrier concentration (reduced to a
bulk value) is equal to 4·10^18 cm^-3 throughout.

These effects are clearly seen in Fig. 1, which shows the spectra of the interband
absorption coefficient $\alpha_{cv}$ and the related carrier-heating-induced change in
propagation constant, $\delta\beta_{cv}$, computed for
both bulk and MQW semiconductors. The applied two-band model does not take into
account the Coulomb electron-hole interaction because the actual photon energies are well
above the excitonic absorption region. In the case of an MQW structure no coupling between
separate quantum wells is assumed, and the absorption coefficient is defined as
$\alpha_{cv} = \gamma_{cv}/d$, where d is the period of the structure and $\gamma_{cv}$ is the
probability of interband absorption related to a single quantum well. The normalization constant
proportional to the square of the momentum matrix element of a bulk material has been extracted
from Ref. [5]. Note that for MQW structures the values of $\alpha_{cv}$ above the interband
absorption edge, as well as the peak absolute values of $\delta\beta_{cv}$, are significantly less
than for a bulk semiconductor. This is due to the lower density of states in an MQW structure as
compared to a bulk material. The reduction in $\alpha_{cv}$ directly above the fundamental
absorption edge is estimated as

m^j  ) ,  where  and  are  the  electron  and  reduced  effective 

masses,  respectively,  and  k^p  =  jh.  Under  the  strong  degeneracy  conditions  this 

factor is quite small, and therefore the same variation of $\delta\beta_{cv}$ requires a larger change in
$T_e$ in the case of an MQW structure than in the case of a bulk semiconductor. That is why
we restrict our further consideration to bulk semiconductors only.

Temperature dependencies of $\alpha_{cv}$ and $\delta\beta_{cv}$ lead to absorptive and dispersive optical
nonlinearities, respectively, provided the carrier heating is caused by optical excitation. In
such a case, for any time scale beyond the very short stage of carrier thermalization, the
optical response can be treated as linear, but all the material parameters are assumed to
depend parametrically on $T_e$. Regarding the possible mechanisms of photoinduced carrier
heating in the narrow-gap semiconductors under consideration, the most important ones are
intraband absorption, generation of energetic carriers, and Auger recombination [2,3].
3. Model of the active medium

To describe the carrier heating in an optically excited semiconductor, the processes of
light-carrier interaction and nonequilibrium carrier relaxation should be considered in a
self-consistent manner. Our approach for a bulk semiconductor is based on the
following assumptions: (1) Electrons are thermalized, i.e. completely described by their
concentration $N_e$ and effective temperature $T_e$. (2) Heavy holes, contrary to light electrons,
remain in equilibrium and are therefore described by their concentration $N_h$ only. The latter is
rather small compared to $N_e$ but should be taken into account because of the high sensitivity
of the interband absorption to changes in $N_h$. (3) Any nonequilibrium state is quasineutral,
i.e. $N_e = N_0 + N_h$, where $N_0$ is the concentration of ionized impurities. (4) Light
absorption within the actual spectral range results from both direct interband and indirect
(impurity, hole, electron and LO-phonon assisted) intraband transitions. (5) The
nonequilibrium carrier recombination is due to the CHCC Auger process. (6) Electron-to-lattice
energy transfer is governed by inelastic LO-phonon scattering, which is restricted by the
phonon bottleneck effect, i.e. hot-phonon effects are also taken into account. Then, the
description of the photoexcited semiconductor is reduced to a pair of diffusion
equations written for the hole concentration $N_h$ and electron temperature $T_e$. Fig. 2 shows
the parametric dependence of the total absorption coefficient $\alpha$ on the local light intensity
$I$ related to their steady-state homogeneous solution. It can be seen that the effective
temperature is an S-shaped function of the local light intensity in the spectral range
corresponding to direct transitions below the Fermi quasilevel, which is due to the rapid growth of
$\alpha$ with $T_e$. The multivalued shape of the steady-state homogeneous $T_e(I)$ curve is the
necessary condition for optical bistability due to increasing absorption, as has been discussed
previously [2,3]. The true steady-state distributions of the carrier parameters, and thus the real
possibility of the bistable response and related switching phenomena, depend on the relation
between the hole diffusion and electron thermal conductivity lengths, $l_h$ and $l_T$, the values
of the absorption and propagation constants, $\alpha$ and $\beta$, and also the size of the active
region, $l$.


Fig. 2 Nonlinear absorption characteristics of bulk Ga0.47In0.53As: absorption coefficient vs normalized
electron temperature (a) and electron temperature vs local optical power (b).
4. Nonlinear response of the stationary excited Fabry-Perot etalon.

If the absorption is small owing to band filling, the conditions $\alpha l \ll 1 \ll \beta l$ seem to be
quite reasonable. Assuming also that the layer thickness is smaller than the hole diffusion and
electron thermal conductivity lengths, $l < l_h, l_T$, we can treat the carrier
parameter distribution as uniform and use the mean-field approximation for the optical power
within the nonlinear Fabry-Perot etalon formed by a semiconductor layer. Hence, the
description of the nonlinear response reduces to a pair of rate equations written for the hole
concentration $N_h$ and electron temperature $T_e$. Usually the cooling time $\tau_T$ is as small as
≈ 1 ps (even if lagged by the phonon bottleneck effect), while the recombination time $\tau_N$
is in any case more than 1 ns. Such a great difference makes it possible to distinguish fast and
slow stages in the dynamics of the photoexcited semiconductor, governed by the relaxation of
$T_e$ and $N_h$, respectively. The fast response of a nonlinear etalon, within the time scale
$\tau_T < t < \tau_N$, is shown in Fig. 3, displaying the bistability of both the electron
temperature and the reflected power characteristics. However, on a time scale longer than
$\tau_N$ such a response under stationary
excitation may be unstable owing to the competition between the influences of carrier generation
and heating on band filling. This instability and the related regenerative pulsations are of the same
nature as those pointed out by McCall [6] and later observed by Mackenzie et al. [7] in a
semiconductor etalon with competing electronic (fast) and lattice (slow) dispersive
nonlinearities. The only difference is that in our case both competing nonlinearities are
electronic (caused by the influences of carrier concentration and temperature on band filling),
which is why the period of the self-pulsations (determined by $\tau_N$) may be reduced to a few ns.


Fig. 3 The bistable steady-state characteristics of the asymmetric Fabry-Perot etalon formed by a uniform
layer of bulk Ga0.47In0.53As: electron temperature (a) and reflected power (b) vs incident power.
Abstract. A series of all-optical devices based on nonlinear excited-state absorption
working at a non-resonant frequency is proposed. Experimental and theoretical results
obtained with C60 and metal-organic materials using ns and ps lasers at 532 nm are
presented in this paper.

All-optical devices are necessary for future optical communication and optical computing.
Although semiconductor devices using resonant nonlinearity usually exhibit a large nonlinear
refractive index, their large linear absorption may cause low device throughput and undesirable
thermal effects. Here we demonstrate a new kind of device based on excited-state nonlinear
absorption working at a non-resonant wavelength using organic materials, which are advantageous
for their low linear absorption, fast response and mirrorless structure.

1.  Excited-State  Nonlinear  Absorption 

The reverse saturable absorption (RSA) of C60 and copper phthalocyanine (CuPc) has been
investigated. The absorption spectra of the ground state and the differential absorption spectra of
the excited states are shown in Fig. 1(a) for C60 and Fig. 1(b) for CuPc. One can see that the
absorption cross sections of the ground state are smaller than those of the triplet and singlet
excited states at a wavelength of 532 nm, so RSA can be observed.


(a)  (b) 

Fig. 1. Absorption spectra of the ground state and the excited states for C60 (a) and CuPc (b).


The energy-level diagram for the metallo-organic compounds or C60 is shown in Fig. 2, where
S0 is the ground-state electronic level; S1 and S2 are the first and second singlet excited-state
electronic levels, respectively; T1 and T2 are the first and second triplet excited-state electronic
levels, respectively. There are many vibronic sub-levels above each electronic level.

Fig. 2 Energy-level diagram for C60 and metallo-organic materials.


The molecules in S0, S1 and T1 simultaneously absorb photons of the same frequency $\nu$ with
absorption cross sections $\sigma_0$, $\sigma_1$ and $\sigma_T$ and make transitions to S1v, S2v and T2v,
respectively. They rapidly relax from there to S2 and T2, and then relax nonradiatively to the next
lower levels with transition probabilities $k_{10}$, $k_{21}$ and $k_{T0}$, respectively. Because the
lifetimes of S1v, S2v, T2v and S2, T2 are very short (<ps), the populations in these levels can be
neglected. The rate equations describing the time variation of the populations $n_0$, $n_1$, $n_T$ in
S0, S1, T1, and a light-propagation equation describing the variation of the photon flux $\phi$ along
the z direction in the sample, can therefore be written as:


(1) 

(2) 

$N = n_0 + n_1 + n_T$ (3)

$d\phi/dz = -(\sigma_0 n_0 + \sigma_1 n_1 + \sigma_T n_T)\,\phi$ (4)

where N is the total population and $\phi = I/h\nu$. Assume the intensity of the incident pulsed light is a
Gaussian temporal function,

$I = I(t,z) = I_0(z)\, e^{-c\,(t/\Delta t)^2}$ (5)

where $I_0(z)$ is the peak intensity at z, $\Delta t$ is the pulse width of the laser, and c is a normalization constant.
Using Eq. (1)-Eq. (5), the curve of energy transmittance T = F(L)/F(0) versus the incident fluence
F(0), namely the RSA characteristic, can be calculated. The curves of T versus F(0) have been
simulated numerically and agree well with the experimental results, as shown in Fig. 3.
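
A numerical calculation of this kind can be sketched in a few lines of Python. The example below is only an illustration under loud assumptions, not the authors' code: the rate equations are written in one common three-level form (including an S1 → T1 intersystem-crossing rate `K_1T`, which is our assumption), all cross sections and rates are hypothetical round numbers, the pulse is the Gaussian of Eq. (5) with c = 4 ln 2, and the sample is divided into thin slices through which the photon flux is propagated according to Eq. (4).

```python
# Energy transmittance versus input fluence for a reverse saturable absorber.
# Illustrative sketch: rate-equation form, cross sections and rates are
# hypothetical assumptions, not values from the paper.
import math

H_NU_J = 3.734e-19              # photon energy at 532 nm, joules
C_NORM = 4.0 * math.log(2.0)    # makes PULSE_NS the full width at half maximum
PULSE_NS = 8.0                  # pulse width in ns

SIGMA_0 = 3.0e-18               # ground-state cross section, cm^2 (hypothetical)
SIGMA_1 = 1.5e-17               # S1 excited-state cross section, cm^2 (hypothetical)
SIGMA_T = 1.0e-17               # T1 excited-state cross section, cm^2 (hypothetical)
K_10 = 0.1                      # S1 -> S0 rate, 1/ns (hypothetical)
K_1T = 0.8                      # S1 -> T1 rate, 1/ns (assumed channel)
K_T0 = 2.5e-5                   # T1 -> S0 rate, 1/ns (hypothetical)
COLUMN_DENSITY = -math.log(0.7) / SIGMA_0   # N*L for 70 % linear transmittance


def energy_transmittance(fluence_j_cm2, n_slices=20, dt_ns=0.02):
    """Return F(L)/F(0) for a Gaussian pulse of the given input fluence."""
    col = COLUMN_DENSITY / n_slices
    n0 = [1.0] * n_slices       # fractional populations in each slice
    n1 = [0.0] * n_slices
    nt = [0.0] * n_slices
    phi_peak = fluence_j_cm2 / (H_NU_J * PULSE_NS * math.sqrt(math.pi / C_NORM))
    n_steps = int(round(6.0 * PULSE_NS / dt_ns))
    e_in = 0.0
    e_out = 0.0
    for step in range(n_steps):
        t = -3.0 * PULSE_NS + (step + 0.5) * dt_ns
        phi = phi_peak * math.exp(-C_NORM * (t / PULSE_NS) ** 2)
        e_in += phi * dt_ns
        for j in range(n_slices):
            depth = (SIGMA_0 * n0[j] + SIGMA_1 * n1[j] + SIGMA_T * nt[j]) * col
            pump = SIGMA_0 * phi * n0[j]
            d0 = (-pump + K_10 * n1[j] + K_T0 * nt[j]) * dt_ns
            d1 = (pump - (K_10 + K_1T) * n1[j]) * dt_ns
            d_t = (K_1T * n1[j] - K_T0 * nt[j]) * dt_ns
            n0[j] += d0
            n1[j] += d1
            nt[j] += d_t
            phi *= math.exp(-depth)     # Eq. (4) integrated over one slice
        e_out += phi * dt_ns
    return e_out / e_in


t_low = energy_transmittance(1.0e-4)
t_high = energy_transmittance(1.0)
print(f"T at 1e-4 J/cm^2: {t_low:.3f}")
print(f"T at 1    J/cm^2: {t_high:.3f}")
```

With excited-state cross sections larger than the ground-state one, the transmittance starts at the linear value and falls as the fluence increases, which is the RSA behaviour referred to in the text.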


Fig. 3. Experimental data and simulations of RSA for solutions of C60 (a) and CuPc (b) with ns/ps laser pulses.

2. Photonic Devices Based on Excited-State Nonlinear Absorption

Transient absorptive optical bistability was obtained experimentally using a CuPc solution
and laser pulses with a width of 15 ns at 532 nm. The comparison between the calculated result (1) and
the experimental data (2) is illustrated in Fig. 4.

Fig. 4. Experimental data and simulations of transient optical bistability in CuPc solution.

The experimental data for optical limiting, i.e. output fluence F(L) versus input fluence F(0),
were obtained using C60 and CuPc solutions and an 8 ns pulsed laser at 532 nm. The simulations are
also consistent with the experimental data, as shown in Fig. 5.

Fig. 5 Experimental data and simulations of optical limiting in CuPc and C60 solutions.

When a Nd:YAG 15 ns pulsed laser with I0 = 3x10^ W/cm^2 at 532 nm pumps a C60
solution, a large population can be accumulated in T1. If at the same time a cw laser diode beam
at 747 nm, the absorption peak wavelength of the excited state T1, passes through the C60 solution,
its photons will be strongly absorbed by the molecules in T1, so the output intensity switches off. This
all-optical switching process (photograph) is shown in Fig. 6(a). The simulation can be made by using
Eq. (1)-(3) for the pump beam and Eq. (4) for the probe beam, as shown in Fig. 6(b). The switch-on time
depends on the incident laser pulse width, about 15 ns; the switch-off time depends mainly on the
relaxation time of T1, about 300 ns.


(a)  (b) 

Fig. 6 All-optical switching in C60. (a) Experimental photograph, (b) Simulation curve.

Fig. 7(a) gives the experimental and simulated results of optical modulation, i.e. the relative
transmission of the probe beam T/T0 versus the peak pumping intensity I0, where T0 is the transmission
of the probe beam without the pump beam. A design for "Exclusive AND" and "Exclusive OR" logic gates
based on the modulation characteristics at different working points is shown in Fig. 7(b).


Fig. 7 The modulation characteristics (a) and a design of exclusive "and" and "or" logic gates (b).

The excited-state photonic devices (ESPD) have many advantages: fast response time (ns to
sub-ps), small total linear absorption coefficient, small working power for the probe beam, and
mirrorless structure. Devices of this kind are therefore promising for use in all-optical computing,
all-optical communication and laser protection.
Abstract. The possibility of using known quasi-phase-matched second-harmonic generators as the
main components of an all-optical gate capable of performing logical operations on WDM
signals is considered. The main parameters of the gate are estimated.


1. Introduction

Wavelength-division-multiplexing (WDM) techniques are now widely used, allowing one to
transmit many optical signals of different wavelengths simultaneously through a single
optical channel. The opportunities of such an approach would be much greater if
polychromatic all-optical gates capable of direct processing of WDM signals were available.
At present, no gates of this kind are known even in theory.

One  should  note  that,  at  present,  there  are  also  no  all-optical  gates  for  processing  usual 
optical  signals  which  were  able  to  compete  with  electronic  gates  in  such  basic  parameters  as 
dissipated  power  and  switching  frequency.  Nevertheless,  carrying  out  careful  analysis,  one 
can  find  that,  to  date,  there  are  relatively  small-power  and  superfast  all-optical  devices  that 
could  be  used  as  the  main  nonlinear  component  of  the  WDM  gate  with  the  overall  rate  of 
signal  processing  near  the  maximal  bit-rate  in  a  short  optical  communication  line.  These  are 
integrated-optic  quasi-phase-matched  second-harmonic  generators  (SHG). 

A  simple  and  practical  method  is  described  for  constructing  WDM  gates  consisting  of 
SHGs  [1-3]  and  linear  wavelength  selective  couplers.  All  these  components  are  tested 
experimentally  in  many  optical  devices. 


2.  Operating  principles 

The  main  idea  of  the  gate  is  that  the  known  quasi-phase-matched  SHG  based  on  an 
integrated-optic  domain-inverted  waveguide  [1,2]  can  be  used  as  an  optical  parametric 
travelling  wave  amplifier  (OPTWA)  without  being  changed.  For  an  OPTWA,  the  condition 
for  quasi-phase-matching  (QPM)  can  be  written  as  follows: 

$\Delta k = 2\pi/\Lambda$ (1),

where

$\Delta k = [n(\omega_p)\omega_p - n(\omega_s)\omega_s - n(\omega_i)\omega_i]/c$ (2),

$n(\omega)$ is the effective refractive index of the waveguide, $\Lambda$ is the period of the domain structure
in an SHG, c is the velocity of light in vacuum, $\omega_p$, $\omega_s$ and $\omega_i$ are the carriers of the pump, signal
and idle waves, respectively, and

$\omega_p = \omega_s + \omega_i$ (3).

In the particular case when $\omega_s = \omega_i = \omega_p/2$, equation (1) is valid for a quasi-phase-matched
SHG. For a normal-dispersion medium, in particular for the domain-inverted waveguide,
$n(\omega)$ is an increasing function of $\omega$.

One can verify the following two statements concerning the value of $\Delta k$ in (1), provided
that $n(\omega)$ is an arbitrary increasing function:

1. If $\omega_p$, $\omega_s$ and $\omega_i$ increase by the positive values $\Delta\omega$, $\Delta\omega/2$ and $\Delta\omega/2$, respectively, then $\Delta k$
becomes greater.

2. If $\omega_p$ is constant, $\omega_s > \omega_i$, and the difference $\omega_s - \omega_i$ increases by a positive value $\delta\omega$, then $\Delta k$
becomes less.

Thus, for each $\Delta\omega > 0$ we can find a value of $\delta\omega > 0$ such that $\Delta k$ is constant. Indeed,
the increment in $\Delta k$ caused by an increase in $\omega_p$ of $\Delta\omega > 0$ can be compensated by the decrease in
$\Delta k$ due to an increase in the value of $\omega_s - \omega_i$ by $\delta\omega > 0$. In other words, any SHG with
$\omega_s = \omega_p/2$ for which equation (1) is satisfied can be used as an OPTWA with other values of
$\omega_p$ and $\omega_s$ for which (1) is also valid.

In the particular case when $n(\omega)$ is a linear increasing function and $\omega_s = \omega_i = \omega_{p0}/2$ for
$\omega_p = \omega_{p0}$, then to satisfy the quasi-phase-matching condition (1) for a pump wave carrier
greater than $\omega_{p0}$ the signal and idle wave carriers should follow the functions:

$\omega_s = (\omega_p + [\omega_p^2 - \omega_{p0}^2]^{1/2})/2$ (4),

$\omega_i = (\omega_p - [\omega_p^2 - \omega_{p0}^2]^{1/2})/2$ (5).

Here $\omega_{p0}$ is the carrier of the pump for which the OPTWA operates in the degenerate regime.
The same OPTWA can operate in the SHG regime to obtain the second harmonic with the
carrier $\omega_{p0}$. The dependence of $\omega_s$ and $\omega_i$ on $\omega_p$ according to (4) and (5) is depicted in
figure 1. Because the values $\omega_s$ and $\omega_i$ are determined uniquely from the system of equations (1)-(3),
for $\omega_p > \omega_{p0}$ each value of $\omega_p$ corresponds to only one pair of values $\omega_s$ and $\omega_i$. There are
no waves with carriers $\omega_s$ and $\omega_i$ different from those given above that could interact with
the pump $\omega_p$.
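
Equations (4) and (5) can be checked numerically against the phase-matching condition. The Python sketch below is an illustration only: the linear dispersion $n(\omega) = a + b\omega$ uses hypothetical coefficients `A_COEF` and `B_COEF`, and the degenerate pump carrier is taken, as an assumption, to correspond to a vacuum wavelength of 0.775 μm. For several pump carriers above $\omega_{p0}$ it evaluates $\Delta k$ from equation (2) with the signal and idle carriers given by (4) and (5) and compares it with the degenerate value.

```python
# Numerical check that Eqs. (4) and (5) keep the wave-vector mismatch of
# Eq. (2) constant for a linear n(omega). All numerical values are
# hypothetical and chosen only for illustration.
import math

C_LIGHT = 2.998e8          # velocity of light in vacuum, m/s
A_COEF = 2.0               # hypothetical dispersion offset
B_COEF = 1.0e-16           # hypothetical dispersion slope, s/rad


def n_linear(omega):
    """Linear, increasing effective index n(omega) = a + b*omega."""
    return A_COEF + B_COEF * omega


def delta_k(omega_p, omega_s, omega_i):
    """Wave-vector mismatch of Eq. (2)."""
    return (n_linear(omega_p) * omega_p
            - n_linear(omega_s) * omega_s
            - n_linear(omega_i) * omega_i) / C_LIGHT


def signal_idle(omega_p, omega_p0):
    """Signal and idle carriers from Eqs. (4) and (5)."""
    root = math.sqrt(omega_p ** 2 - omega_p0 ** 2)
    return (omega_p + root) / 2.0, (omega_p - root) / 2.0


OMEGA_P0 = 2.0 * math.pi * C_LIGHT / 0.775e-6   # assumed degenerate pump carrier
dk_degenerate = delta_k(OMEGA_P0, OMEGA_P0 / 2.0, OMEGA_P0 / 2.0)

results = []
for scale in (1.0, 1.01, 1.05, 1.2):
    omega_p = scale * OMEGA_P0
    omega_s, omega_i = signal_idle(omega_p, OMEGA_P0)
    results.append((omega_p, omega_s, omega_i, delta_k(omega_p, omega_s, omega_i)))

for omega_p, omega_s, omega_i, dk in results:
    print(f"omega_p/omega_p0 = {omega_p / OMEGA_P0:.2f}   "
          f"dk/dk0 = {dk / dk_degenerate:.12f}")
```

For linear dispersion the mismatch reduces to $2b\,\omega_s\omega_i/c$, so keeping $\Delta k$ constant is equivalent to keeping the product $\omega_s\omega_i$ equal to $\omega_{p0}^2/4$, which is exactly what (4) and (5) do.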


Fig. 1 Dependence of the carriers $\omega_s$ and $\omega_i$ of the signal and idle waves on the carrier $\omega_p$ of the
pump in an OPTWA if the QPM condition is satisfied.

Fig. 2 Schematic diagram of the gate.
If several pumps with different carriers enter an SHG simultaneously then the resulting
change in the refractive index n is equal to the sum of the effects when each pump enters
an SHG separately. The character of modulation performed by one pump does not depend on
the presence of other pumps. Input control signals with carriers $\omega_{sm}$ ($m=1,\dots,M$) interact only
with the corresponding pumps with carriers $\omega_{pm}$ for which condition (1) is valid.

As the waves propagate along a lightguide, the interaction mentioned manifests itself in
an increase in the intensity of the waves with carriers $\omega_s$ and $\omega_i$ and in a decrease in intensity
of the wave with carrier $\omega_p$ [4]. The intensity of the wave with carrier $\omega_p$ falls to zero after
some distance. If the length of an SHG is equal to this distance, there is no wave with carrier
$\omega_p$ at the output of an SHG. This situation can take place if a small control signal A with
carrier $\omega_s$ enters an SHG.

3.  Gate  design 

A schematic diagram of the proposed gate is shown in figure 2. Several identical SHGs with
periodic domain-inverted structures [1,2] are connected in series. Optical power is
delivered to the input of the left SHG in the form of a periodic sequence of pulses with carrier
frequency $\omega_{pm}$. Frequency-selective directional couplers (DC) are used both for entering
logical signals $A_{jm}$ (in fig. 2 j=1,2,3) with carriers $\omega_{sm}<\omega_{pm}$ and for extraction of result signals
with the same carriers from the SHG. The coefficient H of cross-transmission of the DCs is
close to 1 for signals with carrier frequency $\omega_{sm}$ and to 0 for signals with carrier $\omega_{pm}$. As the
localization of the field of a light wave in the cross-section of a lightguide decreases for lower
frequencies, the coupling coefficients for waves with carriers $\omega_s$ and $\omega_i$ substantially exceed that for a
wave with carrier $\omega_p$. In this case the usual linear DC having H=0 for the wave with carrier $\omega_p$
has H=1 for waves with carriers $\omega_s$ and $\omega_i$.

We can see that the signal with carrier $\omega_s$ is present at the output $B_1$ only when control
signal $A_1$ with carrier $\omega_s$ is present at the input, that is, the output signal corresponds to the logical
function $B_1=A_1$. Analogously, a signal with carrier $\omega_s$ is present at the output $B_2$ only in the
absence of signal $A_1$ and the presence of signal $A_2$, i.e. the resulting signal at output $B_2$
corresponds to the logical function $\bar{A}_1A_2$.

Generally, the gate has N logical inputs $A_1, A_2, \dots, A_N$ for control signals and the same
number of logical outputs for result signals $B_1, B_2, \dots, B_N$. The resulting signals correspond to the
following functions: $B_1=A_1$, $B_2=\bar{A}_1A_2$, $B_3=\bar{A}_1\bar{A}_2A_3$, ...
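The logical functions listed above can be checked with a small truth-table sketch. It models only the Boolean behaviour (the first stage that receives a control signal depletes the pump, so later stages have nothing to amplify); the function name is a hypothetical helper, not something defined in the paper.

```python
from itertools import product

def gate_outputs(inputs):
    """Return [B_1, ..., B_N] for control inputs [A_1, ..., A_N] (0 or 1).

    B_j = A_j AND NOT(A_1 OR ... OR A_{j-1}): a stage produces an output
    only if the pump has not been depleted by an earlier stage.
    """
    outputs = []
    pump_left = True  # pump still present at the input of the current stage
    for a in inputs:
        b = bool(a) and pump_left
        outputs.append(int(b))
        if b:
            pump_left = False  # pump is converted in this stage
    return outputs

# Truth table for the three-stage gate of figure 2
for a in product((0, 1), repeat=3):
    print(a, gate_outputs(a))
```

At most one output is ever active, and it belongs to the first stage whose control input is present.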


4. Main parameters

The  power  required  for  the  gate  can  be  evaluated  from  an  analysis  of  the  power  characteristics 
of  SHGs  which  are  the  main  components  of  the  gate  and  the  only  nonlinear  devices  whose 
operation  depends  on  the  intensity  of  light  signals.  Present  technology  makes  it  possible  to 
produce SHGs with a power of tens of milliwatts. Standard GaAs/AlGaAs lasers can be used as a
power  supply.  For  a  gate  the  power  of  the  input  control  signal  can  be  lO^-lO^  times  less  than 
that of the pump, that is, less than the power of other known waveguide all-optical gates by a
factor of $10^3$.

With regard to the rate of processing optical pulses, the greatest restriction is imposed by
the difference in group velocities of pulses with carriers $\omega_p$ and $\omega_s$ in an SHG. Assuming that
the difference in velocities is equal to 8% [5] and the length of an SHG is ~1 cm, we obtain that
the pulse walk-off is 10 ps and the pulsewidth must be greater, e.g. 20 ps. Besides, the minimal
pulsewidth $\tau$ is limited by the bandwidth of an amplifier. The bandwidth $\Delta F$ of an SHG having a
length of about 1 cm is 0.2 nm [1], or $5\times10^{10}$ Hz. As $\tau=1/\Delta F=20$ ps, the maximal rate of the
pulses is about 25 GHz. This is substantially less than the bandwidth of the optical transmission
channels, but the gate can process M sets of signals simultaneously, so that the maximal rate of the
pulses is M times higher.

The maximal value of M is determined by the expression $M=W/W_1$, where $W$ is the total
bandwidth for the various pumps determined by the frequency capabilities of the nonlinear
material, waveguides, and couplers, and $W_1$ is the bandwidth for a single pump. For example, if $W$
is equal to 10% of carrier $\omega_1$, $\omega_1=6\pi\times10^{14}$ s$^{-1}$, and $W_1=50$ GHz, then M=600. Naturally, the total
power of the pump also increases M times. Thus, taking into consideration the whole
bandwidth of WDM signals and the possibility of simultaneously processing M sets of signals, we
can see that the number of logical operations performed by the gate is comparable with the
maximal number of bits transmitted through optical communication lines.
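The arithmetic behind these estimates can be reproduced in a few lines. This is only a sketch under stated assumptions: the 50 GHz bandwidth is the value implied by a 20 ps minimal pulsewidth, and the 25 GHz rate is read here as one pulse per two pulsewidths, which is an interpretation rather than something the text states. All variable names are illustrative.

```python
import math

delta_f_hz = 50e9              # SHG bandwidth implied by tau = 1/delta_F = 20 ps
tau_s = 1 / delta_f_hz         # minimal pulsewidth in seconds
pulse_rate_hz = 1 / (2 * tau_s)  # assumption: pulse period of two pulsewidths

omega_1 = 6 * math.pi * 1e14   # carrier angular frequency in s^-1 (from the text)
carrier_hz = omega_1 / (2 * math.pi)
total_bandwidth_hz = 0.10 * carrier_hz  # W: 10% of the carrier
single_pump_bandwidth_hz = 50e9         # W_1
m_sets = total_bandwidth_hz / single_pump_bandwidth_hz  # M = W / W_1

print(f"tau = {tau_s * 1e12:.0f} ps")
print(f"pulse rate = {pulse_rate_hz / 1e9:.0f} GHz")
print(f"M = {m_sets:.0f}")
```

With these inputs the sketch reproduces the 20 ps, 25 GHz and M=600 figures quoted in the text.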

Contrast ratio K of the gate is equal to the ratio of the output result signals corresponding
to logical 1 and 0. When the pump is present and the input signals are amplified by $\gamma$ dB, the
output signal corresponds to logical 1. When the pump is absent and the input signals pass to
the output with an attenuation of $\alpha$ dB, the signal corresponds to logical 0. The value of $\alpha$ is
0.5-1.0 dB/cm, $K=\gamma+\alpha$, and the value of $\gamma$ is restricted only by the level of noise and the
parasitic signals reflected from the outputs and can exceed 20 dB. Thus, the gate has
satisfactory parameters from the viewpoint of a circuit designer.


5.  Applications 

It  appears  likely  that  the  principal  application  of  the  gate  described  here  will  be  in  WDM 
systems.  Another  application  of  the  gate  is  its  use  in  all-optical  supercomputers.  Such  a  gate 
enables  an  implementation  of  a  polychromatic  computer  comprising  a  set  of  N 
monochromatic supercomputers working concurrently. Each computer uses common optical
hardware (gates and transmission lines) and operates at its own wavelength independently of
the others. In this case total performance is increased N times without the hardware being
enlarged. 

6.  Conclusion 

The analysis performed shows that the present state of the art for creating optical
quasi-phase-matched second-harmonic generators makes it possible to construct an all-optical gate
without substantial additional investigation. The gate can process several sets of optical pulses
simultaneously, has satisfactory parameters from the viewpoint of a circuit designer, and is
significantly superior to any other available gate in operating rate and power consumption.
Moreover,  no  new  materials  or  technologies  are  required  for  its  fabrication. 
Diffraction kinetics of electronic and thermal transient
gratings  in  GaAs  epilayers 
and  GaAs/GaAlAs  multi-quantum  wells 
Abstract:

Diffraction  kinetics  of  transient  gratings  in  GaAs  epilayers  and  GaAs/GaAlAs  multi-quantum 
wells  at  room  temperature  are  reported.  Samples  are  prepared  by  lift-off  epitaxy.  The  results  are 
interpreted  in  terms  of  time  separation  of  the  electronic  and  thermal  contributions. 

1.  Introduction 

The  aim  of  this  work  was  to  study  the  subnanosecond  diffraction  capabilities  of 
GaAs  epilayers  and  GaAs/GaAlAs  multi-quantum  wells  at  room  temperature,  using  a 
forward  geometry.  The  epitaxial  structures  were  grafted  onto  glass  slides  by  means  of 
the  lift-off  epitaxy  technique.  Diffraction  kinetics  of  transient  gratings  were  measured 
using  the  first  order  diffraction  in  the  Raman-Nath  configuration. 

2.  Experimental  set-up 

Pulses at the wavelength 0.532 μm were generated by frequency doubling, in KDP
crystals, the fundamental of an actively mode-locked Nd:YAG laser. The
duration of the pulses was approximately 30 ps. These pulses synchronously pumped a dye
laser operating in the range 825-850 nm. The duration of the dye laser pulses was approximately
15-20 ps. Two pump pulses (P1 and P2) interacted inside the sample to produce
transient gratings. The gratings were then read by a probe pulse (S) generated by the dye
laser and directed at normal incidence to the sample surface (Raman-Nath configuration).
The probe pulse was delayed with respect to the pump pulses P1 and P2. The pump and
probe beam sections were limited by a circular diaphragm stuck on the sample. The
diameter of the hole was 1 mm. We measured the energy of the first-order diffracted
pulse versus the probe delay. In a first set of experiments, gratings were generated at the
wavelength 532 nm and probed in the infrared. The external angle between the two
pump beams was $2\theta$ = 0.1 radian, corresponding to a grating period $\Lambda$ of 5.3 μm.
Typical diffraction efficiency kinetics are shown in Fig. 1. Then, in a second set of
experiments, we used degenerate pumps and probe in the infrared. The external angle was
$2\theta$ = 0.18 radian and $\Lambda$ = 4.6 μm. The diffraction efficiency kinetics are reported in Fig. 2.
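The quoted grating periods follow from the usual two-beam interference relation $\Lambda=\lambda/(2\sin\theta)$, where $2\theta$ is the external angle between the pump beams. The short sketch below checks both values; the relation itself is standard, but the choice of 0.834 μm for the degenerate experiment is an assumption taken from the wavelength reported later for Fig. 2 (left), and the function name is illustrative.

```python
import math

def grating_period_um(wavelength_um, full_angle_rad):
    """Fringe spacing of two beams crossing at the external angle 2*theta."""
    return wavelength_um / (2 * math.sin(full_angle_rad / 2))

# First set of experiments: gratings written at 532 nm, 2*theta = 0.1 rad
print(round(grating_period_um(0.532, 0.10), 2))
# Second set: degenerate pumps in the infrared (assumed 834 nm), 2*theta = 0.18 rad
print(round(grating_period_um(0.834, 0.18), 2))
```

Both results agree with the periods of 5.3 μm and 4.6 μm given in the text to within rounding.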

3.  Structure  preparation 

Active  multi-quantum  well  epitaxial  structures  on  transparent  substrates  were 
achieved  in  two  steps.  First,  the  structures  were  grown  on  GaAs  substrates  by  molecular 
beam  epitaxy.  They  consist  of  80  periods  of  alternating  GaAs  wells  and  (Ga,Al)As 
barriers, whose thicknesses are 9.6 nm and 13 nm respectively. The quantum wells are
sandwiched between two 0.2 μm thick Ga$_{0.7}$Al$_{0.3}$As layers. Second, the epitaxial
structures were removed from their substrates and grafted onto glass slides by means of
the lift-off epitaxy technique. The areas of the structures were around 4 mm². The optical
transmission spectra of the structures were recorded. The absorption threshold of the
structures was around 850 nm at room temperature (excitonic resonance).


4.  Experimental  results  with  gratings  generated  at  532  nm 


[Figure 1: two panels; horizontal axes: Probe delay (ps)]

Fig. 1: Diffraction kinetics in the first-order Raman-Nath configuration measured on
GaAs/GaAlAs multi-quantum wells. The pump pulse energies for P1 and P2 are respectively 4 mJ/cm²
and 2 mJ/cm². In the left curve, the time axis has been expanded.

The  diffraction  kinetics  exhibit  a  peak  during  the  first  200  ps  and  then  a  minimum 
within the interval (200-300 ps). The diffracted pulse is rather unstable in this latter
interval, so each reported data point has been averaged over 20 pulses. After 300 ps
the  energy  of  the  diffracted  pulse  rises  up  to  a  stable  value.  We  interpret  these  results  in 
the  following  manner: 

The  peak  in  the  diffraction  kinetics  of  Fig.  1  is  due  to  an  electronic  process.  The 
presence  of  the  electron-hole  plasma  induces  a  decrease  of  the  optical  index  in  the 
illuminated fringes. Moreover, the crystal is heated by carrier relaxation and non-radiative
recombination, which results in an increase of the optical index in the illuminated fringes.
The  decay  times  of  these  two  opposite  contributions  are  very  different.  A  change  of  sign 
of  the  optical  index  variation  takes  place  during  the  interval  (200-300  ps).  This  transient 
extinction  of  the  index  grating  results  in  a  pronounced  minimum  observed  in  the 
diffraction  kinetics.  In  the  following,  we  will  briefly  discuss  the  order  of  magnitude  of  the 
electronic  and  thermal  contributions  to  the  diffraction  efficiency  in  order  to  test  the 
consistency of this interpretation. The absorption coefficient in Ga$_{0.7}$Al$_{0.3}$As is
around 6x10^ cm$^{-1}$ [1]. We will therefore consider here that only the 0.2 μm thick
Ga$_{0.7}$Al$_{0.3}$As layer is excited. The plasma density generated in this layer during the
picosecond excitation is limited by Auger recombination. We estimate the plasma density
at $\rho$=5x10^ cm$^{-3}$ in the illuminated fringes (taking the Auger coefficient equal to 3x10^
cm$^{6}$ s$^{-1}$). The plasma-induced optical index variation can be written [2]:

— F  2  2'  P 

^  2no  [cOg-CD^ 

where $n_0$ is the optical index, $\mu$ is the reduced mass, and $\omega$ and $\omega_g$ are the angular
frequencies of the light and of the bandgap respectively. At the peak of the curve in Fig. 1, the
plasma contribution to the diffracted signal is dominant, so the above-mentioned
formula can be used to estimate the optical index variation. We obtain a maximum index
variation of 0.1 to 0.2. The other contribution to the optical index variation arises from
the dynamics of the temperature rise $\Delta T$, which is governed by:
where $D_T$ is the heat diffusion coefficient, $c_p$ is the specific heat capacity, M is the
crystal density and $\rho(t)$ is the time-dependent electron-hole plasma density. The
recombination of the carriers occurs with a time constant $\tau$ and releases a quantity of
energy $E_p$ per electron-hole pair to the lattice. For probe delays above 400 ps, the grating
decay time $\tau_T$ is governed only by heat diffusion, and is given by:


where $K_T$ is the thermal conductivity. Assuming a total conversion of optical into
thermal  energy,  the  initial  temperature  rise  is  estimated  to  AT  =  a  Eqp^  /  Mcp  and  the 
optical index rise reads: $\Delta n_T = (dn/dT)\,\Delta T$. For numerical calculations we use: $n_0$ = 3.4
at 824 nm and $n_0$ = 3.87 at 532 nm [1], the electron effective mass $m_e$ = 0.092 [3], the
hole effective mass $m_h$ = 0.5 [4], the bandgap $E_g$ = 1.8 eV, $K_T$ = 0.122 J s$^{-1}$ cm$^{-1}$ K$^{-1}$, the
specific heat capacity $c_p$ = 0.364 J g$^{-1}$ K$^{-1}$, M = 4.88 g cm$^{-3}$, and dn/dT = 1.9x10$^{-4}$ [5].
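To give a feel for the time scale of the thermal grating, the sketch below evaluates a heat-diffusion decay time from the material constants listed above. The expression for the decay time is not legible in this copy, so the code assumes the textbook result for a sinusoidal thermal grating, decay time = $\Lambda^2 M c_p/(4\pi^2 K_T)$; this is an assumption, not the authors' confirmed formula. The variable names and the use of the 5.3 μm period from the 532 nm experiment are likewise illustrative.

```python
import math

K_T = 0.122          # thermal conductivity, J s^-1 cm^-1 K^-1 (from the text)
c_p = 0.364          # specific heat capacity, J g^-1 K^-1 (from the text)
M = 4.88             # crystal density, g cm^-3 (from the text)
period_cm = 5.3e-4   # grating period of the 532 nm experiment, in cm

# Thermal diffusivity derived from the listed constants (cm^2 s^-1)
diffusivity = K_T / (M * c_p)

# Assumed textbook decay time of a sinusoidal thermal grating
tau_thermal_s = period_cm**2 / (4 * math.pi**2 * diffusivity)

print(f"diffusivity = {diffusivity:.3f} cm^2/s")
print(f"thermal grating decay time = {tau_thermal_s * 1e9:.0f} ns")
```

Under this assumption the thermal grating decays on a time scale of the order of 100 ns, far longer than the probe delays used, which fits the stable plateau observed after 300 ps.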

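The first-order diffraction efficiency in the Raman-Nath regime takes the following
forms [6], for the peak value and the plateau value respectively:

$$\eta_{peak} = A\,J_1^2\!\left[\frac{2\pi(\Delta n_p+\Delta n_T)d}{\lambda}\right] \approx A\left[\frac{\pi(\Delta n_p+\Delta n_T)d}{\lambda}\right]^2$$

$$\eta_{plateau} = A\,J_1^2\!\left[\frac{2\pi\Delta n_T d}{\lambda}\right] \approx A\left[\frac{\pi\Delta n_T d}{\lambda}\right]^2$$

where $\lambda$ is the probe wavelength, $J_1$ is the Bessel function of the first order, d is
the grating thickness, and A is a constant. Then, we obtain the following ratio:

$$\frac{\eta_{peak}}{\eta_{plateau}} = \left[\frac{\Delta n_T+\Delta n_p}{\Delta n_T}\right]^2$$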


We  have  neglected  here  the  contribution  of  the  absorption  gratings.  This  is 
supported  by  the  fact  that  the  probe  transmission  modulation  is  lower  than  2%,  therefore 
the  calculated  diffraction  efficiency  is  lower  than  1 0'^,  which  is  negligible  with  respect  to 
the  measured  value,  around  10"^.  Using  the  above-mentioned  parameters,  except  for  the 
Auger  coefficient  which  is  taken  between  10"^^  cm^  s"^  and  10'^^  cm^  s"^,  the  numerical 
calculation  gives; 

3  peak^  ^plateau  ■ 

This result is in reasonable agreement with the experimental data (see Fig. 1). In
Fig. 1, the transition between the electronic and thermal gratings has been clearly
observed because of the high density and temperature of the carriers generated in the
Ga$_{0.7}$Al$_{0.3}$As crystal. In the following experiments, we will use excitation wavelengths
close to the bandgap, so $E_p$ and $\Delta T$ will decrease.
5. Diffraction results in degenerate pump-probe experiments


Fig. 2: Diffraction kinetics in the first-order Raman-Nath configuration measured on
GaAs/GaAlAs multi-quantum wells. Left: the pulse energies for P1, P2 and the probe S were respectively
6.7 μJ, 4.6 μJ and 0.23 μJ. Right: the wavelength was $\lambda$ = 815 nm; the diffraction efficiency
decreases with the probe pulse energy.

In these experiments, pump and probe beams are produced by the same picosecond
dye laser. The total pump energy is of the order of 1 mJ/cm². The diffraction efficiencies
are defined with respect to the transmitted pulse. Fig. 2 (left) shows the diffraction
efficiency kinetics for $\lambda$ = 834 nm. This wavelength corresponds to a rather
homogeneous excitation of the wells. The diffracted signal decay time is 170 ps,
corresponding to a grating lifetime in the structure of 340 ps. Taking into account the
ambipolar diffusion, we estimate a carrier lifetime greater than 500 ps. We show in Fig. 2
(right) two kinetics at 815 nm. In this case, the excitation of the wells is more
inhomogeneous than at 834 nm. The main difference between the two kinetics lies in the
values of the probe pulse energy. In the experiment reported in the upper curve, the
probe energy is lower than the pump energy, while in the case of the lower curve, pump
and probe energies are of the same order of magnitude. In the latter case, a partial grating
erasure by the probe pulse is probable. The maximum diffraction efficiency is roughly an
order of magnitude greater at 834 nm than at 815 nm (due to the influence of the
excitonic resonance at 850 nm). No thermal gratings have been detected in these
degenerate experiments, in contrast with the previous results (Fig. 1). This is due to the
decrease of the pump pulse energy used, and to the improvement of the excitation
homogeneity. To conclude, the good reliability of the lift-off epitaxy technique allows its
use for all-optical epitaxial device engineering.
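The step from a 170 ps signal decay to a carrier lifetime above 500 ps can be sketched numerically. The diffracted signal scales with the square of the index modulation, so the grating lifetime is twice the signal decay time (340 ps, as stated). The sketch then assumes the standard free-carrier grating relation, in which the grating decay rate is the sum of the recombination rate and the diffusion rate $4\pi^2 D/\Lambda^2$. The paper gives neither this relation nor a value for the ambipolar diffusion coefficient D, so both the relation and the sample values of D below are assumptions for illustration only.

```python
import math

def carrier_lifetime_ps(grating_lifetime_ps, period_um, diffusion_cm2_s):
    """Recombination lifetime from 1/tau_g = 1/tau_R + 4*pi^2*D/period^2."""
    period_cm = period_um * 1e-4
    diffusion_rate = 4 * math.pi**2 * diffusion_cm2_s / period_cm**2  # s^-1
    grating_rate = 1 / (grating_lifetime_ps * 1e-12)                  # s^-1
    recombination_rate = grating_rate - diffusion_rate
    if recombination_rate <= 0:
        raise ValueError("diffusion alone would decay faster than observed")
    return 1e12 / recombination_rate

signal_decay_ps = 170
grating_lifetime_ps = 2 * signal_decay_ps  # signal goes as modulation squared

# Hypothetical ambipolar diffusion coefficients, in cm^2/s
for d_ambipolar in (0.0, 3.0, 6.0):
    tau_r = carrier_lifetime_ps(grating_lifetime_ps, 4.6, d_ambipolar)
    print(d_ambipolar, round(tau_r))
```

Any non-zero diffusion coefficient pushes the inferred carrier lifetime above the 340 ps grating lifetime, which is the sense of the correction described in the text.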
Abstract  Since we expect future photonic networks to use wavelength multiplexed technologies,
we designed two key components (wavelength conversion and filtering components) using
commercially available popular technologies. Design parameters and design results are introduced.


1.  Introduction 

From the perspective of a photonic switching network system designer, the first step in the
implementation of photonic networks should be the introduction of frequency multiplexing
technologies into subscriber access fiber lines. Figure 1 illustrates this idea. Frequency
multiplexing over a subscriber-access fiber line facilitates transporting a variety of media
signals over a single fiber. By making line usage more efficient, this method encourages fiber
installation, which is indispensable for future fiber networks.

If this proposal is accepted, the next step would be the introduction of a switching
system based on frequency multiplexing, such as the one shown in Fig. 2. Since the switching
matrix has a very large traffic capacity, it should use time-division multiplexing (TDM) technology.


Figure  1.  Shared  Usage  of  a  Subscriber  Access  Fiber. 
The functions in this system are wavelength conversion, filtering, and coupling.

As a system designer, I tried to find devices that perform these functions, but it is
quite difficult to get a number of components that can be assembled as part of a switching
system. Most research-level devices are not yet practical in terms of cost, reliability, and
availability. I therefore designed two of these components: a wavelength converter and filters.
The purpose of this design exercise was to determine the availability of existing commercial
technologies. Of course, they will be improved as new device technologies become available. The
purpose of this work was to give device researchers an example of the design parameters that
photonic switching system designers may want in the future.


Figure 2. Experimental Switching Fabric Design.


The  system  design  parameters  are  as  follows.  The  values  were  selected  mostly  based  on  my 
knowledge  and  experience.  The  main  objective  was  to  find  a  set  of  values  that  can  be 
implemented  as  an  experimental  fiber  access  line  and  a  switching  system  based  on  frequency 
multiplexing.  Therefore,  an  economical  component  design  was  the  first  priority. 


(a) Wavelength band: 1100 - 1600 nm

(b)  Number  of  multiplexed  waves:  8 

(c)  Minimum  adjacent  wavelength  separation:  20  nm 

(d)  Fiber  mode:  multi-mode  (50-micron  core  diameter) 

(e)  Maximum  signal  speed:  622  Mbps  (>1  GHz) 

(f)  Transmission  distance:  5-10  km 

The selected laser diode (LD) wavelengths were 1175-1190, 1210-1225, 1275-1285,
1305-1310, 1330-1340, 1505-1525, 1545-1550, and 1570-1580 nm.
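The selected LD bands can be checked against parameters (a) to (c) with a short sketch. It assumes that the minimum adjacent wavelength separation is measured between the upper edge of one band and the lower edge of the next; that reading, and the names used in the code, are assumptions rather than something the text spells out.

```python
# Selected LD wavelength bands in nm, as listed in the text
BANDS_NM = [
    (1175, 1190), (1210, 1225), (1275, 1285), (1305, 1310),
    (1330, 1340), (1505, 1525), (1545, 1550), (1570, 1580),
]

def check_channel_plan(bands, window=(1100, 1600), channels=8, min_gap=20):
    """Return (plan_ok, gaps) for a list of (low, high) bands in nm."""
    inside = all(window[0] <= lo <= hi <= window[1] for lo, hi in bands)
    # Gap between the top of one band and the bottom of the next
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(bands, bands[1:])]
    plan_ok = len(bands) == channels and inside and min(gaps) >= min_gap
    return plan_ok, gaps

ok, gaps = check_channel_plan(BANDS_NM)
print(ok, gaps)
```

Under that reading, the eight bands fit the 1100-1600 nm window and every pair of neighbours is separated by at least 20 nm.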


2.  Wavelength  Conversion  Component 

This component converts a signal of one wavelength into another wavelength under the control
of a wavelength selection signal. The incoming fiber signal is received by a photo detector (PD),
the electronic signal is amplified, an LD driver activates the appropriate LD (out of eight), and
then the output light is guided to an output fiber through a coupling device.

(a)  Input  signal  level:  >-35  dBm  (applied  InGaAs-APD) 

(b)  Electronic  amp.  gain:  >40  dB 

(c)  Modulation  transparency:  digital/analog  (by  bias  selection) 

(d)  LD  output  signal  level:  0  to  3  dBm  (using  a  commercial  Fabry-Perot  resonator-type  LD 
bare  chip) 

(f)  Fiber  output  signal  level:  -10  to  -15  dBm 

(g)  Signal  selection  port:  TTL  level  (one  out  of  eight  selection) 

(h)  Size:  135x70x40  mm  (excluding  connectors  and  pigtails) 

(i) Power: +5, -5, +80 volts

(j)  Other:  power-feed  connector,  channel-selection  monitor  LED,  input-signal  monitor  LED. 

3.  Filtering  Component 

This component receives the signal from the input fiber; the signal contains eight multiplexed
wavelengths. The selected wavelength signal is guided to the output port (out of eight) that
corresponds to the wavelength. This component has no active devices. The input light signal
is converted into a parallel (collimated) beam by multiple lenses. Each time the beam reaches a filter, it
is divided into two wavelength signals: one passes through to the output port and the other is
reflected to the next output port. Each output port has a lens circuit that guides the light to an
output fiber.

(a)  Number  of  newly  designed  lens  types:  eight 

(b)  Filter  production  technology:  dielectric  multi-coated  filter 

(c)  Number  of  filter  types:  16 

(d)  Insertion  loss:  less  than  10  dB  (planned) 

(e)  Size:  26x26x18  mm  (excluding  connectors  and  pigtails) 
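The pass-and-reflect behaviour of the filter chain described in this section can be sketched as a simple routing model. It is a purely logical illustration: each filter passes the wavelengths inside its band to its own port and reflects everything else to the next filter. The passbands are assumed here to coincide with the LD bands selected in Sec. 1, and the function and variable names are hypothetical; no optical losses are modelled.

```python
# Assumed filter passbands in nm (taken to match the LD bands of Sec. 1)
PASSBANDS_NM = [
    (1175, 1190), (1210, 1225), (1275, 1285), (1305, 1310),
    (1330, 1340), (1505, 1525), (1545, 1550), (1570, 1580),
]

def demultiplex(wavelengths_nm, passbands_nm):
    """Route wavelengths through a chain of pass/reflect filters.

    Returns (ports, leftover): ports[k] lists the wavelengths passed by
    filter k; leftover lists wavelengths that no filter passed.
    """
    ports = []
    remaining = list(wavelengths_nm)
    for lo, hi in passbands_nm:
        passed = [w for w in remaining if lo <= w <= hi]
        remaining = [w for w in remaining if not (lo <= w <= hi)]  # reflected on
        ports.append(passed)
    return ports, remaining

# One example wavelength per band (illustrative values)
signals = [1180, 1220, 1280, 1308, 1335, 1515, 1548, 1575]
ports, leftover = demultiplex(signals, PASSBANDS_NM)
print(ports, leftover)
```

In this model each of the eight ports receives exactly one wavelength and nothing is left over, matching the one-port-per-wavelength description.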

4.  Design  Evaluation 

As  stated,  the  purpose  of  this  design  exercise  was  to  determine  the  availability  of  existing 
commercial  technologies.  I  found  that  it  is  possible  to  design  both  wavelength  conversion  and 
filtering  components  if  the  system  design  parameters  listed  in  Sec.l  are  accepted.  From  these 
experimental  components,  it  is  also  possible  to  construct  a  subscriber-access  fiber  line  and  a 
switching  system  based  on  frequency  multiplexing,  even  though  technical  elaboration  is 
necessary. 

Figure 3 shows the LD drive current vs. the optical output power characteristics of the
wavelength conversion component. The commercial technologies used were Fabry-Perot
resonator-type LDs and an electronic circuit. The horizontal axis includes the coupling device
loss, which is approximately 15 dB. Although Figure 3 only shows the measurements for one
sample component, it clearly shows that the LD drive current vs. optical output
power characteristic differs for each wavelength. The electronic circuit should therefore compensate for this
characteristic difference. To prevent over-driving the current on an LD, a current limiting circuit
is included.
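The figures quoted in Secs. 2 and 3 can be combined into a rough power budget. This is a sketch under stated assumptions: the chain is taken to be LD, coupling device, filtering component, photo detector; the filter loss is set to the planned 10 dB; and fiber and connector losses are left out, so the printed margin is what remains for them. The function name and the ordering of the chain are assumptions, not a budget given in the paper.

```python
def power_budget(ld_output_dbm, coupling_loss_db=15.0,
                 filter_loss_db=10.0, detector_sensitivity_dbm=-35.0):
    """Return (fiber_output_dbm, detector_input_dbm, margin_db)."""
    fiber_output = ld_output_dbm - coupling_loss_db   # after the coupling device
    detector_input = fiber_output - filter_loss_db    # after the filtering component
    margin = detector_input - detector_sensitivity_dbm  # left for fiber/connectors
    return fiber_output, detector_input, margin

# LD output range given in the text: 0 to 3 dBm
for ld_dbm in (0.0, 3.0):
    print(ld_dbm, power_budget(ld_dbm))
```

With a 15 dB coupling loss the computed fiber output falls between -15 and -12 dBm, which sits inside the -10 to -15 dBm range listed for the wavelength conversion component.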

Figure 4 shows the average insertion loss distribution for each frequency as measured for forty
experimental filtering components. The commercially available technology used here was
dielectric multi-coated filter production. The production technology was slightly improved to meet
the design targets. The results show that it is possible to design a very small passive filter. The
component size is 26x26x18 mm. Assembly work has proved that the alignment of the optical beam
direction is critical. The figure shows the results of manual beam adjustment. The insertion loss
could be reduced to 10 dB by improving the optical beam alignment.


Figure 3. Wavelength Conversion Component. (Axis: LD drive current (I) (mA).)

Figure 4. Filtering Component. (Axis: Insertion loss (dB).)


This  design  exercise  showed  that : 

(a) More than forty units of each component were produced, and their cost was reasonable.

(b)  Eight  wavelengths  provide  an  adequate  degree  of  multiplexing  at  present.  With  more,  it 
would  be  difficult  to  use  commercial  LD  devices  and  coupling  device  loss  would  increase. 

(c)  There  is  a  technical  bottleneck  in  the  connection  between  the  LD  and  the  coupling  device. 
Laser diode alignment deviation causes a loss of more than 10 dB. A simple method to solve this
problem  is  required. 
Abstract

A great deal of work has been done on two-dimensional optical devices for free-space optical
interconnection in order to attain a large interconnection throughput for optical
computing. In addition to space-domain processing, optical frequency-domain data processing
can realize a larger throughput by means of a multi-dimensional interconnection technology.

Introduction 

Here, an optical frequency shifter would be one of the most important devices, because it may
realize multi-dimensional interconnection as well as FDM crossconnect technology in
communication networks (shown in Fig. 1). Some optical frequency (wavelength) conversion
devices, such as semiconductor amplifiers utilizing the FDM effect and bistable laser diodes, have been reported
so far. These have the advantageous features of high conversion efficiency (>0 dB) and large
frequency (wavelength) shift (>10 nm). However, they also have drawbacks in signal bandwidth
(<GHz) and signal modulation code (only applicable to intensity-modulated signals). A rotating half-wave
plate can, in principle, realize bit-rate-free and modulation-code-free optical frequency
conversion since it relies on the Doppler effect.

Principle 

In this paper, we report the fundamental modulation characteristics of a waveguide-type optical
frequency shifter based on the rotating phase plate, as the results of this project in the last fiscal
year. We successfully demonstrate parallel and perpendicular phase modulation characteristics for
the effective azimuth control of the phase plate.

Figure 2 shows the principle of optical frequency shifting by the rotating half-wave plate.
The angular frequency of right (left) circularly polarized incident light ($\omega$) is converted to $\omega-\omega_p$ on
the phase plate rotating clockwise (counter-clockwise) due to the Doppler effect. Here, $\omega_p$ denotes
the angular frequency of the rotating phase plate. If the phase plate is a half-wave plate, the output
light becomes left (right) circularly polarized light with the angular frequency $\omega-2\omega_p$. This
process does not depend on the signal bit-rate or modulation code of the incident light. Therefore,
optical devices based on this principle can serve as a bit-rate-free and modulation-code-free
(applicable not only to intensity-modulated, but also to FSK or PSK coded signals) optical frequency
shifter. The rotating phase plate could be realized effectively by applying a rotating electric field to an EO
material.
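The frequency shift produced by a rotating half-wave plate can be illustrated with Jones calculus. The sketch below uses the standard Jones matrix of a half-wave plate whose fast axis is at angle theta (global phase dropped) and shows that circularly polarized input emerges with the opposite handedness and an extra phase of twice the plate angle. If the angle grows as $\omega_p t$, that phase advances as $2\omega_p t$, which is a frequency shift of magnitude $2\omega_p$. The sign of the shift depends on the handedness and time conventions chosen, which are assumptions here, as are the helper names.

```python
import cmath
import math

def half_wave_plate(theta):
    """Jones matrix of a half-wave plate with fast axis at angle theta."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return ((c, s), (s, -c))

def apply(matrix, vector):
    """Multiply a 2x2 Jones matrix by a 2-component Jones vector."""
    return tuple(sum(m * v for m, v in zip(row, vector)) for row in matrix)

def overlap(reference, field):
    """Complex projection <reference|field>."""
    return sum(a.conjugate() * b for a, b in zip(reference, field))

R = 1 / math.sqrt(2)
circular_in = (R, 1j * R)    # one circular handedness (convention assumed)
circular_out = (R, -1j * R)  # the opposite handedness

def converted_amplitude(theta):
    """Amplitude in the opposite handedness after the plate."""
    return overlap(circular_out, apply(half_wave_plate(theta), circular_in))

def unconverted_amplitude(theta):
    """Amplitude remaining in the original handedness after the plate."""
    return overlap(circular_in, apply(half_wave_plate(theta), circular_in))

for theta in (0.0, 0.3, 0.6):
    amp = converted_amplitude(theta)
    # phase of the converted wave versus twice the plate angle
    print(theta, round(cmath.phase(amp), 6), round(2 * theta, 6))
```

All of the light ends up in the opposite handedness, and its phase tracks twice the plate azimuth regardless of any modulation on the input, which is why the principle is independent of bit-rate and modulation code.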

Design 

Schematic views of the fabricated device are shown in Figs. 3 and 4. The waveguide was formed by
RTBE, after successive growth of an AlGaAs lower clad, a GaAs guide and an Al$_{0.3}$Ga$_{0.7}$As upper clad
layer on a (110) GaAs substrate oriented 6° off towards <111>B. All layers were undoped and
grown by MBE. Because the light propagation direction should be along the 3-fold axis of
GaAs (<111>), the waveguide was designed to be S-shaped, inclined to the (110) cleaving
facets by 35°. Three Schottky electrodes (Cr/Au), one on the waveguide and two
beside the waveguide, were evaporated to apply the rotating electric field to the waveguide.

Experiment 

A laser diode ($\lambda$=1.55 μm) was used as the input light source. Both TE and TM modes were
launched to evaluate the phase modulation characteristics induced by the parallel and perpendicular
field, respectively (shown in Fig. 5). As shown in Fig. 6, the obtained $V_\pi$ values for the parallel and
perpendicular field were 10 V and 13 V, respectively. Therefore, by applying sinusoidal electric
fields to the electrodes with adequate phases, the effective rotating azimuth of the phase plate can be
induced as the waveguide frequency shifter because of necessity of the high voltage and high
frequency electric driver. However, we believe the issue can be overcome by introducing a large EO effect based on quantum size effects such as the QCSE. Extensive research on new nonlinear optical materials is therefore strongly called for.

Conclusion 

In conclusion, the fundamental modulation characteristics of a waveguide-type optical frequency shifter based on the rotating phase plate have been reported for the first time as an optical device for multi-dimensional interconnection. The demonstrated parallel and perpendicular phase modulation characteristics for the effective azimuth rotation of the phase plate indicate the possibility of a bit-rate-free and modulation-code-free frequency shifter.

Abstract. Digital chaotic behaviour in an Optical-Processing Element is reported. It is obtained as the result of processing two fixed trains of bits. Period doublings in a Feigenbaum-like scenario have been obtained. A new method to characterize digital chaos is reported.

1.  Introduction 

An optically programmable digital circuit has already been reported by us [1]-[2] as a Programmable Logic Gate. A brief description of its method of operation, as well as the way it has been implemented, can be found there.

As is well known from the literature, there are several situations in which chaotic behaviour arises in electrical and electronic circuits. Most of the results concern analogue signals. Their characteristics have been studied by the conventional methods employed for any other nonlinear phenomenon.

A very different situation arises when the circuit operates with digital signals and the possible chaotic result is a signal composed of "zeroes" and "ones". Hence, the main objective of this paper is to present a new method of obtaining this type of chaotic signal, as well as an alternative way to study it.


2.  General  structure  of  the  Optical-Processing  Element  with  feedback. 

The general scheme of the cell has been reported previously in several places [1]-[3]. A logic behaviour was reported showing the possibility of obtaining up to fourteen pairs of logic functions from two digital inputs. Two control gates allow the change from one type of logic output to another. The cell was implemented with optical components, although the non-linear devices P and Q, namely an "on-off" and a "SEED-like" device, were simulated with optoelectronic methods (see Fig. 1). The outputs of these devices are the two final outputs of the cell, O1 and O2. The circuit has four possible inputs: two for input data, I1 and I2, and two for control signals, g and h. The corresponding inputs to the non-linear devices, P and Q, are based on these signals plus some others coming from inside the cell.

The practical implementation of the processing element that we have carried out is based on an optoelectronic configuration. Lines in Fig. 1 represent multimode optical fibers. The indicated blocks, placed in order to combine the corresponding signals, are conventional optical couplers. In this way, the inputs arriving at the P and Q devices are multilevel signals. These devices have been simulated electronically. Optical signals were converted to electrical signals by conventional photodiodes and, after processing, converted back to optical signals by LEDs. More details can be found in reference [3].

A new situation appears when some type of feedback is added to the cell. Moreover, in order to work with some more parameters, a time delay has been added to the feedback. Another time delay has been introduced inside the cell itself. This delay corresponds to the response time of the non-linear devices which, in our previous case, were optoelectronic simulations. The general configuration, showing these delays as well as the whole cell, appears in Fig. 1.


Figure  1.-  Optical-Processing  Element  with  Feedback.  White  boxes  are  2  x  2  or  2  x  1  couplers. 


As can be seen, there are several possibilities for adding feedback to the cell. Any connection between one of the two possible outputs, O1 or O2, and any of the four inputs, namely I1, I2, g and h, gives feedback, but the results differ according to the configuration adopted. Because the P device has more possible output functions (seven, depending on its control signal) than the Q device, its output has been used for feedback. This signal becomes the control signal g for device P. Figure 1 shows the circuit employed. A computer simulation has been used for the rest of the present work.


3.  General  behaviour  of  the  cell. 

The first analysis that we performed considered null delay times. This situation has no analytic solution and no data were obtained. The circumstances are very different if we introduce finite delay times, namely internal and external delays.

According to previous studies [4], the situation most likely to give a periodic or even chaotic solution is that in which the internal delay time is shorter than the external one. In every case, the input has been a regular train of pulses. The input to the non-linear device is a multilevel signal corresponding to the addition of the two periodic inputs. The period of this signal is, in the case studied, 14 milliseconds.

If the ratio between the internal delay time and the external delay time is smaller than 1, we obtain a periodic situation. The period of this signal is strongly dependent on the value of the ratio. For the particular case in which the external delay time is 200 ms and the internal delays are 2, 4 and 12 ms, the results obtained are summarized in Table I. An interesting result is the doubling of the period as the ratio between the delays gets smaller. In our case, it goes from 70 to 140 to 280. Hence, period doubling has been obtained. This result is one of the best indications of a possible route to chaos.


TABLE I.- Characteristics of the output signals, according to the delay times.

Input period (ms)   External delay (ms)   Internal delay (ms)   Delay ratio   Output period
       14                   200                    2                0.01           280
       14                   200                    4                0.02           140
       14                   200                   12                0.06            70

The values given in Table I do not correspond to the real transition points between different periods; they lie in ranges where the period remains constant. If we calculate the equivalent of the Feigenbaum ratio for the indicated values, a value of 4 is obtained. But if higher-order transition points are taken into account, a number closer to 4.6 is obtained.
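
The value of 4 quoted above can be checked directly from the delay ratios of Table I. The following is a minimal sketch, assuming that the Feigenbaum-like ratio is taken as the quotient of two successive intervals of the delay ratio; the function and variable names are illustrative only.

```python
# Internal delays (ms) from Table I, ordered by increasing output period
# (70 -> 140 -> 280); the external delay is 200 ms in all three rows.
EXTERNAL_DELAY_MS = 200.0
internal_delays_ms = [12.0, 4.0, 2.0]

# Control parameter: ratio of internal to external delay (0.06, 0.02, 0.01).
ratios = [t / EXTERNAL_DELAY_MS for t in internal_delays_ms]


def interval_ratio(r1, r2, r3):
    """Feigenbaum-like ratio of two successive parameter intervals."""
    return (r1 - r2) / (r2 - r3)


delta = interval_ratio(*ratios)
print(round(delta, 3))  # 4.0
```

As the text notes, these three values are not the true transition points, so the result is only indicative.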

As can be seen in Table I, as the internal delay time goes to smaller values the period of the output signal gets longer and the signal eventually becomes chaotic. This situation has been obtained only by computer simulation, with an internal delay time of zero. We have not tried to obtain it experimentally. A sample of such a situation is shown in Fig. 2.


Figure  2.-  Output  from  the  logic  cell  when  a  chaotic  behavior  is  present. 


Conventional methods are very difficult to apply in order to characterize the chaotic signal obtained. The problem is that the result is a digital signal with just two values, "0" and "1". Methods employed with analogue signals are not applicable here. Hence a new technique has to be implemented. The method we have adopted is to group the bits in sets of four and to convert each set to its corresponding hexadecimal value. Hence, for example, "0010" would be a "2", "1001" a "9" and "1110" a "14".
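
The grouping step can be sketched as follows. This is an illustrative fragment rather than the simulation code used for the results; the function name is hypothetical, and dropping an incomplete trailing group is an assumption.

```python
def bits_to_hex_values(bits):
    """Group a string of '0'/'1' characters into 4-bit words and
    return the value (0-15) of each word."""
    usable = len(bits) - len(bits) % 4  # drop an incomplete trailing group
    return [int(bits[i:i + 4], 2) for i in range(0, usable, 4)]


# The three examples quoted in the text, concatenated into one bit train.
print(bits_to_hex_values("001010011110"))  # [2, 9, 14]
```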


Figure 3.- Diagram $t_{i+1}$ vs. $t_i$ for a digital chaotic signal as in Fig. 2.


A diagram similar to the $t_{i+1}$ versus $t_i$ diagram used for analogue signals has been obtained here. In the case of periodic signals, a regular configuration is obtained. But in the case of chaotic signals, no definite pattern is obtained. This situation appears in Fig. 3.
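
The construction of such a diagram can be sketched in a few lines. The example below is only an illustration under stated assumptions: the "irregular" bit train comes from a seeded pseudo-random generator standing in for the chaotic output of the cell (it is not produced by the cell simulation), and all names are hypothetical.

```python
import random


def bits_to_hex_values(bits):
    """Group a '0'/'1' string into 4-bit words and return their values."""
    usable = len(bits) - len(bits) % 4
    return [int(bits[i:i + 4], 2) for i in range(0, usable, 4)]


def return_map(values):
    """Pairs (t_i, t_{i+1}) of consecutive values."""
    return list(zip(values[:-1], values[1:]))


# Periodic bit train: the words 2, 9, 14 repeated.
periodic_bits = "001010011110" * 20

# Stand-in for a chaotic output (assumption: pseudo-random bits).
rng = random.Random(0)
irregular_bits = "".join(rng.choice("01") for _ in range(4000))

periodic_points = set(return_map(bits_to_hex_values(periodic_bits)))
irregular_points = set(return_map(bits_to_hex_values(irregular_bits)))

print(len(periodic_points))                          # 3
print(len(irregular_points) > len(periodic_points))  # True
```

A periodic signal visits only a few points of the 16 x 16 grid, whereas an irregular one spreads over it without a definite pattern.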


4.  Conclusions 

A new type of digital chaotic signal has been presented. It is the result of feedback in a previously reported optical-processing logic cell. On the basis of the reported results, a new way of studying digital chaos has been employed. It is based on the conversion from binary to hexadecimal signals. Diagrams of $t_{i+1}$ vs. $t_i$ can be obtained with our method.

Abstract. The possibility of self-focussing and diffractionless propagation of light beams in linear biaxial gyrotropic crystals has been shown. Asymmetrical components of the Kerr-like nonlinearity tensor, when taken into account, may lead to instability of the slow component of fibre modes and also to self-rotation of the polarisation plane of linearly-polarised incident light.


1.  Focussing  and  diffractionless  propagation  of  light  beams  in  biaxial  gyrotropic 
crystals 

Further development of methods of forming the spatial structure of light beams is based on the use of different spatially nonuniform elements: lenses, waveguides, cavity structures, etc. There is quite a different way to control the spatial structure of light beams, which is based on the use of anisotropic media. The anisotropy-induced change of the eigenmode wave vector surfaces is more radical in acoustics and results in phonon self-focussing and self-trapping [1]. Observation of effects of this kind in optics is more difficult owing to the conservation of the positively curved wave vector surface in the presence of anisotropy. The exception is light propagation in the vicinity of the optical axes in biaxial crystals. The presence of small perturbations that remove the degeneracy of the phase velocities, e.g. gyrotropy, leads to substantial reconstruction of the wave vector surface in the vicinity of the axes and to the appearance of regions with negative (or close to zero) curvature, which in turn makes photonic self-focussing and self-trapping possible.

1.1  The structure of the index of refraction surface in the vicinity of binormals

Let  us  represent  beams  in  anisotropic  crystals  in  terms  of  the  integral 

=  ,  (1) 

where the wave vector $k$ of a partial wave is taken as $k = k_n(q)\,n + q$, where $n$ is the unit vector in the beam axis direction. For weakly diverging beams the following expansion is valid [2]:


(2)

where the $u_m$ are tensors of rank $m$; $\omega$ is the light frequency; $v$ and $n_0$ are the phase velocity and the index of refraction in the beam axis direction; $k_0 = \omega/c$; $u_1$ is the vector of group velocity; and $u_2$ is proportional to the curvature tensor of the wave vector surface.


From Maxwell's equations for gyrotropic anisotropic media, when the beam axis and the binormal coincide exactly, one may obtain the following expressions for $u_1 q$ and $u_2 qq$:

_i 

=  =  .  (3) 


e„-==  ej-‘-ej' 


P  =  ^er‘+er>± 


v'^  I ±G,G  -  projection  of  gyration  vector  on  binormal  direction,  =  ql  +  ql,  upper 
and lower signs refer to the fast and slow waves respectively. Taking (3)-(4) into account, it follows that the principal values of curvature are $u_{11} = u_{22} = p / (k_0 n_0)$.


From  this  relationship  it  follows  that  in  the  range  of  the  crystal  parameters,  where 
(63“^ ^  (^3~^ ^  ^2  t«come  negative,  that  means  concave 
formation  at  wave  vector  surface  of  the  slow  wave. 


1.2  Gaussian beam propagation in a crystal with negative and zero curved wave-normal surface

We consider two schemes for propagation inside the crystal, in which the incident beam is either a divergent or a convergent Gaussian beam, respectively.

In  the  first  case  the  output  gaussian  beam  radius  and  its  phase  front  curvature  radius 
are  determined  from  the  expressions: 

0)^(?J  =  (0„M+  , /?(z)  =  n„z  ,  (5) 

where $z = z_1 + z_2 p / n_0$, $z_1$ is the distance from the waist to the front face of the crystal, and $z_2$ is the distance passed by light inside the crystal of length $L$. Beam parameters at the output of the crystal ($z_2 = L$) are determined by (5), except that $R$ is replaced by $n_0 R$.

From (5) it follows that a crystal parameter $p \neq 1$ changes the character of beam diffraction. We consider the regime $p \le 0$.

1.2.1  Diffractionless beam propagation

For $p = 0$ two variants may occur: a) $z_1 \neq 0$. Then $z = z_1$, and $\omega$ and also $R$ do not depend on the distance passed inside the crystal, so diffractionless propagation occurs. b) $z_1 = 0$. Then $z = 0$, $\omega = \omega_0$ and P = 0, that is, the waist of the Gaussian beam is extended over the crystal length.

1.2.2  Beam focussing

Let $p < 0$. The parameter $z$ decreases while light propagates inside the crystal, that is, the beam radius decreases. The specific manifestation of the focussing effect depends essentially on the parameters $z_1$, $p$ and $L$. If $z$ tends to zero when $z_2 = L$, then focussing occurs at the back face of the crystal. If $z = 0$ when $z_2 < L$, then the beam is first focussed inside the crystal at the distance $z_2 = -z_1 n_0 / p$ from the input. The beam then diverges by diffraction but has a negative phase front curvature, which tends to give secondary focussing outside the crystal at the distance $F \approx -(z_1 + L p / n_0)$ from the output of the crystal.
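
The three situations described above can be illustrated numerically. The sketch below assumes that the effective distance entering (5) has the form z = z1 + z2 p / n0; the function names and the numerical values are hypothetical and serve only as an example.

```python
def effective_distance(z1, z2, p, n0):
    """Effective propagation distance after a path z2 inside the crystal
    (assumed form: z = z1 + z2 * p / n0)."""
    return z1 + z2 * p / n0


def focus_inside_crystal(z1, p, n0, length):
    """Depth at which z vanishes, or None if there is no focus inside."""
    if p >= 0:
        return None  # z does not decrease, so it cannot reach zero
    z2 = -z1 * n0 / p
    return z2 if 0 <= z2 <= length else None


def secondary_focus_distance(z1, p, n0, length):
    """Distance of the secondary focus behind the output face, or None."""
    z_out = effective_distance(z1, length, p, n0)
    return -z_out if z_out < 0 else None


# Hypothetical example: waist 10 mm before the crystal, p = -0.5, n0 = 2,
# crystal length 50 mm (all lengths in mm).
print(focus_inside_crystal(10.0, -0.5, 2.0, 50.0))      # 40.0
print(secondary_focus_distance(10.0, -0.5, 2.0, 50.0))  # 2.5
```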

We  investigate  now  the  case  when  a  convergent  gaussian  beam  hits  the  crystal  surface. 
For  its  formation  one  must  supplement  the  optical  scheme  considered  above  with  a  lens  with 
focal  length  F. 

The solution of the diffraction problem in this case has the form of (5) with the replacement $z \to \tilde{z}$, where $\tilde{z} = z_1 - F + z_2 p / n_0$.

When $z_1 < F$, focussing by the lens occurs inside the crystal, whereas when $z_1 > F$ the problem reduces to that considered above. We consider two cases here.

1) Let $z_1 < F$, $p < 0$. Then $\tilde{z} < 0$ corresponds to propagation inside the crystal and diffraction divergence of a beam with a concave wave front. The beam focussing, having stopped inside the crystal, continues outside, and the waist is at a distance $F' = -\tilde{z}(L)$ from the beam output.

2) Let $z_1 < F$, $p = 0$. In this case focussing inside the crystal is stopped and a beam with a negative wave front curvature radius, free from diffraction, is formed. Further focussing occurs after its output from the crystal. Thus in this case the effects of suppression of the divergence and of focussing are realised in succession.

The  study  carried  out  shows  that  biaxial  gyrotropic  crystals  possess  qualitatively  new 
optical  properties  for  light  propagating  in  the  vicinity  of  the  optical  axes.  The  investigation  of 
effects  of  focussing  and  diffractionless  propagation  is  attractive  for  realisation  of  optical 
interconnections  in  optical  information  processing  schemes. 

2.  Light propagation in birefringent optical fibres with anisotropic Kerr-like nonlinearity

Assume a nonlinear polarisation of the form

$P_i^{NL} = \chi_{ijkl}(\omega = -\omega + \omega + \omega)\, E_j^* E_k E_l$.  (6)

The tensor $\chi_{ijkl}$ is known to possess the frequency-reversal symmetry property and is therefore, in the notation presented, symmetrical in the last two indexes [3,4]. We write it as a sum $\chi_{ijkl} = \chi^{s}_{ijkl} + \chi^{a}_{ijkl}$, where $\chi^{s}_{ijkl}$ and $\chi^{a}_{ijkl}$ are symmetrical and asymmetrical on index reversal. In such a representation each of the tensors $\chi^{s}_{ijkl}$ and $\chi^{a}_{ijkl}$ is symmetrical in the first pair of indexes, so in further calculations we use 6-dimensional indexes, $\chi_{(ij)(kl)} = \chi_{\alpha\beta}$. We examine several optical effects that arise when $\chi^{a}_{\alpha\beta}$ is taken into account.

2.1  Account of $\chi_{16}$

From the equations for coupled modes it follows that energy is conserved if the component $\chi_{16}$ is purely imaginary, that is $\chi_{16} = i\chi''_{16}$, which is clear from the general hermiticity condition for tensors of nonabsorbing media. By choosing an appropriate circularly polarised basis one can show that the nonlinear interaction influences only the mode phases. Therefore one may obtain the equation for the phase difference $\psi = \varphi_+ - \varphi_-$:

^=-xn(kr-kr)+x4kr+kr)  • 

As appears from (7), a nonlinear phase shift and, therefore, self-rotation of the polarisation plane occur for linearly polarised incident light, in contrast to the case $\chi_{16} = 0$ [5,6].

2.2  Account for $\chi_{12}$

The  coupled  mode  equations  have  the  form 

-i^  =  {xh+iX‘i)a’b'^  +  {^XuW  +  XnH^  .  W 

-<^  =  (Zi2-‘Xak«^+(2X66l«P  +  Xa2W^-i)*  ’ 
where  k  is  proportional  to  the  linear  anisotropy.  Questions  concerning  stability  of  the  slow 
mode  are  studied  below. 

We  introduce  both  the  parameters  of  anisotropy  Aii=(x22”Xii)/^» 
^u  =  ^X66  +  3Ci2“(Xii  +  X22)/2  and  a  parameter  xj,  charactOTsing  the  relative  level  of 
excitation of orthogonally-polarised modes. When xj = 0, the slow mode is excited, which was stable without account of the Kerr-like anisotropy [5]. Now it loses stability when
(2^  -  A <  1,  where  lc  =  k  lal  ,al  -fht  total  intensity. 

Under  self-consistent  change  of  x]  and  Oq,  when  xj  =  (i  +  (Aii -2^)/ Ai^)/!,  the 
slow  mode  keeps  the  stability  while  intensity  increases.  The  polarisation  of  the  mode  is 
elliptical  and  is  changed  from  linear  on  the  Y-axis  (polarisation  of  slow  mode)  when 
2^  =  All  -  Ai2  to  linear  on  the  X-axis  when  2k  =  An  +  Aij.

Abstract. The potential parameters of planar AO modulators for space and wavelength photonic switching are analyzed. We consider different schemes of superfast AO
switching  with  capability  up  to  10*^  switch./sec  within  one  multichannel  guided  wave  device. 
On the basis of the experimentally achieved parameters of planar Ti:LiNbO3 AO modulators we discuss a few prospective applications in high-speed digital multipliers, commutators for optoelectronic supercomputers and associative memory systems.

1.  Introduction 

Recent developments in acoustooptics have yielded a fairly good set of main parameters for acoustooptic modulators (AOMs) suited to optical beam control and to optical and radio signal processing [1]. The most important of these are the amplitude dynamic range (generally $N_A \approx 200$-$250$), the spatial deflection to many resolvable directions ($N_\theta \approx 500$), the number of resolvable sound frequencies ($N_f \approx 500$) and the number of polarization planes ($N_p \le 2$). The typical risetime for all of these kinds of switching is about the sound transit time ($T_s \approx 10$ μs), which yields an estimate of the potential switching speed of up to $S \approx (N_A \cdot N_\theta \cdot N_f \cdot N_p)/T_s \approx 10^{13}$ switch/sec. This potential switching capability of an AOM seems very high. Unfortunately it is a big problem to realize all its benefits because of the absence of special algorithms and of the related pre- and post-processing electronic circuits. One appropriate way forward is the design of systems with spatially distributed signals inside multichannel AOMs. The most promising approach is the implementation of the recently developed planar AOMs based on Ti-diffused LiNbO3 waveguides and on surface acoustic wave (SAW) propagation [2]. In this case additional advantages such as the simplicity and reproducibility of planar technology, small size and improved compatibility with ordinary electronics should be achieved. This paper deals with the analysis of the potential characteristics of some prospective schemes of AO switching devices based on multichannel planar AOMs, directed to implementation in digital matrix-vector multipliers, associative optoelectronic memory and optical channel commutators for fiber-optic telecommunications and optoelectronic supercomputers.
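
The order of magnitude of the switching speed estimated in this section can be reproduced with a short calculation. This is a minimal sketch: the lower value of the amplitude dynamic range (200) and the upper value of the polarization count (2) are taken from the figures quoted above, and the function name is illustrative.

```python
def switching_rate(n_amplitude, n_directions, n_frequencies,
                   n_polarizations, transit_time_s):
    """Potential switching speed: product of the numbers of resolvable
    states divided by the sound transit time (switchings per second)."""
    states = n_amplitude * n_directions * n_frequencies * n_polarizations
    return states / transit_time_s


# N_A ~ 200, N_theta ~ 500, N_f ~ 500, N_p = 2, T_s ~ 10 microseconds.
s = switching_rate(200, 500, 500, 2, 10e-6)
print(f"{s:.1e}")  # 1.0e+13
```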

2.  Switching  rate  analysis  for  planar  acoustooptic  modulators 

In [2] very high performance was demonstrated for a special type of AOM based on the interaction of guided light modes with a collinearly propagating SAW in planar Ti-diffused waveguides in YX-LiNbO3, which results in a radiative substrate mode with 90° rotation of the plane of polarization (see Fig. 1). At a fixed light frequency $\nu_0$ it showed a very high resolution of radio (sound) frequencies, together with a set of moderate other parameters, in multichannel planar AOMs ($\lambda_0 = 0.63$ μm): the central frequency $f_0 \approx 550$ MHz, the bandwidth $\Delta f \approx 50$-$100$ MHz, the radio frequency resolution $\delta f_s \ge 50$ kHz, the SAW transit time $T_s \le 20$ μs, the number of resolvable frequency sites $N_s \approx 10^3$, the linear dynamic range for input signals $DD \le 35$ dB and the diffraction efficiency $\eta_0 \approx 10$%/W [3,4]. A simple estimate shows that this supports a switching speed of up to $S_1 \approx N_s/T_s \sim 10^8$ operations per second. The expanded version with many light frequencies $N_L$ (the so-called optical frequency multiplexed regime) with $N_L \approx N_s \sim 10^3$ will obviously demonstrate a higher switching capability, $S_{1max} \sim 10^{11}$ switch/sec (see Fig. 2). According to [3], up to 20 separate AO channels can readily be designed on one chip of LiNbO3, which means a corresponding growth of the switching capability limit up to $S_{20max} \sim 10^{12}$ switch/sec.

In all of the earlier estimates it has been assumed that the time for one command to perform many parallel operations is limited by the transit time of the SAW ($T_s \sim 10$ μs), but sometimes it might be replaced by the light transit time $T_L \sim 0.1$ ns, with a proportional growth of the potential switching capability up to $10^{17}$ switch/sec.

Fig. 1. Single-channel planar spatial-frequency AO switch based on Ti:LiNbO3 with collinear wave interaction.
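
The chain of estimates in this section can be followed step by step. The sketch below assumes N_s ~ 10^3 resolvable frequencies (consistent with a bandwidth of 50-100 MHz and a resolution of 50 kHz), N_L ~ N_s light frequencies, T_s ~ 10 μs, T_L ~ 0.1 ns and 20 channels per chip; the variable names are illustrative, and the results are orders of magnitude only.

```python
N_S = 1e3        # resolvable sound frequencies (assumed ~10^3)
N_L = 1e3        # light frequencies in the multiplexed regime (~N_S)
T_S = 10e-6      # SAW transit time, seconds
T_L = 0.1e-9     # light transit time, seconds
CHANNELS = 20    # separate AO channels on one chip

s1 = N_S / T_S                   # single channel, single light frequency
s1_max = N_L * s1                # optical frequency multiplexed regime
s20_max = CHANNELS * s1_max      # 20 channels on one chip
s_light = s20_max * (T_S / T_L)  # limited by light transit time instead

for name, value in [("S1", s1), ("S1max", s1_max),
                    ("S20max", s20_max), ("light-limited", s_light)]:
    print(f"{name}: {value:.0e} switch/sec")
```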

Here we shall consider several commutator schemes optimized for planar multichannel AOMs. Fig. 3 shows the scheme of an AO digital matrix-vector multiplier based on the DMAC algorithm in the spectral domain [5]. It works under special relations between the multifrequency light and sound waves (when the set of light frequencies $\nu_i = \nu_0 q^{-i}$, with $i = 1, 2, \ldots, N$, is fitted to the SAW frequencies $f_j = f_0 q^{j}$, with $j = 1, 2, \ldots, N$, where $q$ is the base of the harmonic sequence). The DMAC for two spectral binary codes performs digital multiplication of the "optical matrix" $a_{mk} = a_{mk}(\nu_1, \nu_2, \ldots, \nu_N)$ by the "sound vector" $b_k = b_k(f_1, f_2, \ldots, f_N)$. The output shows, consecutively in time, the resulting vector components $C_j$ in mixed code presentation. Looking for optimal matching of all the related signals and taking into account the real planar AOM parameters, one can expect an interbit switching capability for this scheme as high as ~10^^ oper./sec with a light pulse duration $T_L \sim 1$ ns [5]. It seems very attractive as an arithmetical processor for the fast solution of systems of linear algebraic equations, with rates up to 10^^ multiplications and additions per second for 16-bit coded operands. At present, however, this kind of development is seriously limited by the lack of an ADC fast enough to decode the AOM outputs (see, for example, [6]).

Fig. 2. The same as Fig. 1 with multiwavelength optical input.

Fig. 3. Digital matrix-vector multiplier scheme using a multichannel planar AO modulator on Ti:LiNbO3 with multifrequency SAW and multiwavelength optics.

If the optical input of this scheme is reduced to a single light frequency $\nu_0$ (Fig. 4), it gives "immediately" (after the light transit time $T_L$) the inner products of the "optical vector" $\{a_k\}$ with the series of "acoustical vectors" $\{b_k\}_j = \{b_k(f_1, f_2, \ldots, f_j)\}$. With $T_{Lmin} \sim 0.1$ ns it gives a maximum capability as high as $S_{max} \approx (N_s \cdot K(K-1))/T_{Lmin} \approx$ 10^^ switch/sec. This type of scalar vector multiplier is well suited to fast comparison of the "optical word" with archives of many "acoustical words" inside the associative memory device of a new generation of optoelectronic supercomputers [7].

The simplest high-fidelity planar AO switch for associative memory systems might be built on the scheme of Fig. 5. It uses the most efficient AO transition inside a planar waveguide, that between the pair of lowest optical modes with orthogonal planes of polarization, TE0 <-> TM0 ($\eta_0 = 1$%/mW). The diffracted light might be separated from the input light by a thin metallic film polarizer at the end of every AO channel, so that the j-th output will consist of the result of the desired inner product

$C_k = \sum_{i=1}^{N} b_k(f_i)\, a(\nu_i)$

if the optical and sound frequencies satisfy the ratio $\nu_i / f_i$ = const. The capability limit for the case of a 20-channel AOM is estimated as $S \approx 20 \cdot 10^3 / 10^{-10} \approx 2 \cdot 10^{14}$ switch/sec. This kind of AO commutator seems to be very reliable, in a form compatible with fiber-optic and integrated-optic peripheral devices.

Fig. 4. Scheme of an AO commutator for an associative memory system utilising frequency multiplexed acoustic signals.

Fig. 5. The same as Fig. 4 with frequency multiplexed light signals.
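
The role of the inner product in an associative memory can be illustrated with a small numerical example. This is only a sketch under stated assumptions: the binary words are hypothetical, each list position stands for one frequency pair with a constant optical-to-sound frequency ratio, and picking the largest inner product is used here as a simple stand-in for the comparison step.

```python
def inner_products(optical_word, acoustic_words):
    """Inner product of the optical word with each stored acoustic word."""
    return [sum(a * b for a, b in zip(optical_word, word))
            for word in acoustic_words]


# Hypothetical 4-bit words; one list entry per frequency pair.
optical_word = [1, 0, 1, 1]
archive = [
    [1, 0, 0, 1],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]

scores = inner_products(optical_word, archive)
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)  # [2, 3, 0] 1

# Capability limit quoted in the text for a 20-channel AOM:
# 20 channels x 10^3 frequencies per light transit time of 0.1 ns.
capability = 20 * 1e3 / 1e-10
print(f"{capability:.0e}")  # 2e+14
```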

The scheme of Fig. 5 should also work in the regime of optical channel commutation generally applicable to optoelectronic computers and fiber-optic telecommunication systems. In fact, if K = N, each input optical channel defined by its frequency $\nu_i$ might be driven to any of the spatial outputs by an appropriate choice of the SAW frequency $f_j$ (because $\nu_i / f_i$ = const). That means that the distribution of all the sound frequencies between the AOM channels fixes the distribution of all the light frequencies between the output channels.

Following  [2]  it  is  easy  to  find  out  the  optimal  relations  of  the  planar  AO  commutator 
parameters: 

,1o  =  sin4M2.F.Ps^^f,  (1) 

L  p-d  J 

$K = p f_0 T_s$,  (2)

where $p$ is the fractional frequency bandwidth of the AOM, $M_2$ is the AO figure of merit of the material, $V_s$ and $f_0$ are the velocity and the central frequency of the SAW, $F$ is the overlap integral of the sound and light waves, and $P_s$ and $d$ are the power and the beam width of the SAW. From (1) and (2) one can estimate the appropriate set of parameters. For example, in the case of a 20-channel AOM based on Ti:LiNbO3 with driving power $P_s \approx 0.1$ W per channel it gives the switching time $T_s \sim 0.1$ μs. This scheme with planar AOMs suffers from excess optical losses (K-times) due to optical splitting into K channels on the input side and from the limited number of planar channels per chip ($K_{max} \le 20$). Both of these parameters could probably be improved by implementing channelized AOMs based on directional mode coupling in Ti-diffused strip-line optical waveguides combined with TIPE acoustic waveguides on LiNbO3 (see, for example, [8,9]). In this case the number of AO channels per chip is limited by the diameter of standard fibers (not less than ~125 μm).
One  can  estimate  as  K^ax^  1^- 
According to [8] the appropriate length of the AO strip-type coupler is $L_c \sim 1$ cm, which limits the number of sequential couplers along the chip to 8. So one can propose the scheme shown in Fig. 6, with a compromise number of parallel ($K_1 \le 20$) and sequential ($K_2 \le 8$) AO couplers, to perform arbitrary optical commutations for 160 channels in a time $T_s \sim 3$ μs.

3.  Conclusion 

Very high limits of the potential switching capability in the space and wavelength domains have been shown above for a few schemes of AO commutators based on Ti:LiNbO3 planar waveguides, with good promise for some practical applications. The results of the paper show the necessity of realizing and investigating in detail full-scale experimental prototypes of the proposed AO commutators, in order to find the expected restrictions due to second-order effects, technological imperfections, etc.

Abstract. This paper describes an acousto-optic FM demodulator that offers improvements in noise rejection performance over earlier systems. The theory is developed, the system is modelled and results are presented. It is shown that the system is capable of giving up to 25 dB of AM rejection.

1.  Introduction 

The  ability  to  demodulate  FM  signals  has  many  applications,  especially  in  the  area  of 
communications  and  equipment  test.  Conventional  methods  such  as  the  PLL  have  provided 
general  purpose  solutions  but  in  many  applications  they  do  not  have  the  required  performance. 
The  acousto-optic  FM  demodulator  [1,2]  offers  great  promise  but  its  potential  performance  has 
been  limited  by  AM  signal  content  and,  more  importantly,  laser  noise  [3].  The  system  suggested 
here  overcomes  these  limitations  for  only  a  modest  increase  in  system  complexity. 

2.  System  architecture 

A  schematic  diagram  of  the  system  is  shown  below  in  figure  1. 


figure  1  System  Schematic 
It consists of a laser, which illuminates an aperture in an acousto-optic (AO) cell. The AO cell
is driven by the signal of interest via an RF amplifier. The AO cell deflects the laser beam
through an angle proportional to the frequency of the applied signal. The deflected beam is
focused by a simple lens onto a photodetector assembly consisting of a bi-cell photodiode. The
bi-cell is arranged so that when an undeviated carrier is used to drive the AO cell the deflected
beam illuminates each half of the bi-cell equally. Any change in frequency will cause the beam
deflection to change and hence the illumination on each half of the diode will vary.

The  resulting  photocurrent  from  each  half  of  the  bi-cell  is  amplified  by  a  transimpedance 
amplifier  and  the  resulting  voltage  outputs  go  to  a  sum-difference  amplifier.  The  resulting 
outputs  give  the  AM  and  FM  contents  of  the  signal  respectively. 

3.  System  Analysis 

The  deflected  intensity  profile  from  the  AO  cell  can  be  described  by  a  truncated  Gaussian  thus: 


where P is the total power in the deflected beam and σ is the standard deviation of the beam
width at the photodetector assembly. P is a function of both laser intensity and AM modulation
on the carrier. Since our detector is a bi-cell device and is positioned such that the undeviated
beam falls on each half of the bi-cell equally then
where K is the ratio of the signal power (P_{1 or 2} − P/2) to the background invariant power P.
If the signal from each half of the photodiode is amplified by a transimpedance amplifier of
gain G and the difference voltage (ΔV) is then taken, we obtain

$\Delta V = G R_\lambda P K$

where $R_\lambda$ is the photodiode responsivity measured in A/W, assumed to be equal for each
half of the photodiode.

The value of K can be expressed as a function of acoustic transit time τ and frequency
deviation δf [3] thus

$K = \sqrt{2\pi}\,\tau\,\delta f$

Therefore

$\Delta V = G R_\lambda P \sqrt{2\pi}\,\tau\,\delta f$

The  output  signal  from  the  difference  amplifier  is  a  direct  and  linear  function  of  both  the  applied 
frequency  and  the  incident  optical  power. 

The  above  relationship  can  be  compared  to  the  output  signal  from  a  single  knife  edge  system, 
given  by 

$V = G R_\lambda \frac{P}{2}(1 + K)$

For small deviations (δx < σ), K ≪ 1, and variations in the optical power P (a function of
laser intensity and AM modulation on the AO drive signal) will have a far greater effect than
in a bi-cell scheme. We define the AM rejection as the inverse of the AM gain; the AM
rejection ratio (AMRR) of the bi-cell scheme relative to the single knife edge system is

$\mathrm{AMRR} = \frac{G R_\lambda P / 2}{G R_\lambda P K} = \frac{1}{2K}$

The AMRR is inversely proportional to the frequency deviation. This is intuitively correct:
with no frequency deviation both diodes receive equal power, so any variations due to
intensity fluctuations or AM signal content are common mode and hence cancel. If a frequency
offset occurs the beam will be displaced and there will be a difference in the respective
photocurrents, so intensity variations cannot be totally removed.

If we express the AMRR in dB and use 'normal' system parameters (τ = 800 ns) then we can
write

$\mathrm{AMRR(dB)} = 54 - 10\log(\delta f)$


This remains positive for frequency deviations of 325 kHz, which is close to the linear
operating range of the system [2].
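
As a rough numerical check of the dB expression, the sketch below evaluates AMRR = 1/(2K) in Python. It assumes K = sqrt(2π)·τ·δf, a form that reproduces the 54 dB constant for τ = 800 ns; the function names and the sample deviations are illustrative only and are not taken from the paper.

```python
import math

def k_factor(tau_s, delta_f_hz):
    """Assumed form K = sqrt(2*pi) * tau * delta_f (dimensionless)."""
    return math.sqrt(2.0 * math.pi) * tau_s * delta_f_hz

def amrr_db(tau_s, delta_f_hz):
    """AM rejection ratio 1/(2K) of the bi-cell scheme, expressed in dB."""
    return 10.0 * math.log10(1.0 / (2.0 * k_factor(tau_s, delta_f_hz)))

TAU = 800e-9  # acoustic transit time quoted in the text, in seconds

if __name__ == "__main__":
    # Sample deviations in Hz; chosen for illustration only.
    for delta_f in (1.0, 1e3, 10e3, 25e3):
        print(f"delta_f = {delta_f:8.0f} Hz  ->  AMRR = {amrr_db(TAU, delta_f):5.1f} dB")
```

With δf = 1 Hz the function returns the constant term of the dB expression, so the remaining dependence is the −10 log(δf) slope.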

Total independence from AM signal content may be achieved by taking the ratio of the
difference (ΔV) and sum (ΣV) signals thus

$\frac{\Delta V}{\Sigma V} = \frac{G R_\lambda P K}{G R_\lambda P} = K$

This has no AM content and is a direct function of FM deviation.

This  can  be  achieved  by  an  analogue  multiplier  configured  as  a  divider  but  this  restricts  both 
system  bandwidth  and  frequency  resolution. 
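
The cancellation of the optical power in the difference/sum ratio can be illustrated with a few lines of Python. This is a sketch under stated assumptions: the two halves of the bi-cell are taken to receive (P/2)(1 + K) and (P/2)(1 − K), which is consistent with a difference voltage of G·R·P·K, and the gain and responsivity values are hypothetical.

```python
def bicell_outputs(p_total, k, gain=1.0e4, responsivity=0.4):
    """Return (difference, sum) voltages of an idealised bi-cell detector.

    Assumption: the halves receive (P/2)(1 + K) and (P/2)(1 - K).
    gain (V/A) and responsivity (A/W) are hypothetical example values.
    """
    v1 = gain * responsivity * 0.5 * p_total * (1.0 + k)
    v2 = gain * responsivity * 0.5 * p_total * (1.0 - k)
    return v1 - v2, v1 + v2

if __name__ == "__main__":
    k = 0.01  # fixed FM-induced imbalance
    # Vary the optical power to mimic laser noise or AM on the carrier.
    for p in (0.5e-3, 1.0e-3, 2.0e-3):
        dv, sv = bicell_outputs(p, k)
        print(f"P = {p:.1e} W  dV = {dv:.3e} V  dV/sV = {dv / sv:.4f}")
```

The difference voltage scales with P, while the ratio stays at K for every power level, which is the AM independence described above.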


4.  Experimental  Results 

The system used consisted of a stabilised He-Ne laser illuminating an Isomet OPT-1 TeO2 cell;
the deflected beam was imaged onto a bi-cell diode via a simple f-f lens system. The
photocurrent from each diode was amplified by a low noise transimpedance amplifier and the
sum and difference signals were taken on a digital oscilloscope. The system was aligned such
that a 45 MHz carrier caused a beam deflection which illuminated each half of the bi-cell
equally.

Any  change  in  frequency  caused  a  change  in  the  respective  illumination  intensities  which  was 
measured.  The  45  MHz  carrier  was  AM  modulated  at  1  MHz  while  the  carrier  frequency  was 
manually  shifted  from  45  MHz  in  500  Hz  steps.  The  resulting  level  of  AM  breakthrough  was 
measured  and  the  AMRR  was  calculated. 

Figure 2 shows the AMRR of a practical system compared to theoretically modelled results. At
low deviation (< 1 kHz) we see a limiting value of around 20 dB, falling to 10 dB at 25 kHz.


figure 2  Theoretical and Experimental Results (rejection in dB versus deviation in kHz)

5.  Conclusions 

It has been shown, both in theory and in practice, that the bi-cell acousto-optic FM
demodulator is capable of offering high levels of AM and intensity noise reduction. For narrow
deviation signals the high levels of AMRR (> 20 dB) are far greater than the 3 dB offered by
the original system. The difference/sum ratio system offers higher levels of AMRR at the
expense of increased cost, reduced bandwidth and decreased frequency resolution. At present,
the difference amplifier gives good performance with only a very small increase in price over
the original single diode system.
Abstract. The performance of an 8 x 8 polarisation-independent Ti:LiNbO3
switch matrix is described on the basis of a new type of directional coupler with
its full complement of the local area network. A reverse Δβ electric field
configuration is used for both the "bar" and "cross" state and no attempt is
made to tune the two states individually. Some possible integrated-optical
devices and the optical versions of multistage nonblocking interconnection
networks are discussed.


1.  Introduction 

The importance of the interconnection networks in tightly coupled multiprocessor systems
has been pointed out by many authors [1-3]. These networks become important when the high
number of system elements to be connected makes the architectures based on a shared bus
inefficient. In particular, in large structures, multistage networks, increasing according to
O(N log2 N), turn out to be important [4]. The possibility of achieving computer systems, or
parts of them, with optical components seems to be particularly attractive due to the various
advantages, such as electromagnetic interference immunity, intrinsic parallelism of the optical
systems, and the consequent large space-bandwidth product [5].

Optical couplers based on tunnel-coupled waveguides are the key elements in a
number of modulation, commutation and logic schemes. In designing such schemes the
problem of cross-talk minimization arises. It was shown [6] that the cross-talk decreases
together with the waveguide coupling, but this decrease leads to a sharp increase in the device
length. The cross-talk in electrooptic switches can also be compensated by using a
two-sectional (Δβ) electrode scheme with sections of different length [7]. However, to get
complete switching from one waveguide to another (Figure 1(a)), a voltage of the opposite
polarity is needed. This peculiarity restricts the usefulness of the device in some schemes.
The cross-talk can be minimized through optimization of adjusting sections in which bent
single-mode waveguides are used [8,9]. A loss inequality of the even and odd modes at the
adjusting sections, however, was not taken into account.

In this paper we show that this inequality is a main cause of the cross-talk. The
dependence of the switch cross-talk on the adjusting section parameters is analyzed. An
adjusting section scheme that allows the cross-talk to be eliminated is proposed. A device
based on diffused channel waveguides is analyzed.

2.  Cross-talk  minimization  and  directional  coupler  design 
2.1 Theory

We deal with weakly inhomogeneous waveguides. Modes of the waveguide are described
approximately by the scalar equation

. 

— -  +  — -  +koJp(x,y,z)=0 

dx^  dx^ 

which is solved for y<0 with the boundary condition ψ|_{y=0} = 0 (y=0 is the substrate
surface equation, k0 is the wave number of free space). Let the waveguide structure have a
plane of symmetry, and let the waveguides at the input (z<0), the output (z>l+2L) and in
the coupling section be parallel and single-mode, as shown in Fig. 1(b). We use the following
expressions for the fields of the even and odd modes of the waveguide structure:

^>ei=Siq)eiix,y)<(Pei>'^^\  rpoi=Ai(poi(x,y)<^oi>'^'\  2:<0,  (2) 

7pe=S  cpe{x,y)<(p}>-^’^,  rpo=A  (Poix,y)«po>'^’^ ,  l+L  z<L,  (3^ 

rpet=St(Pct(x,yi(Pe^'^'\  H’ot^Atq^otix.yicpl^^'^  ,  z>2L+l,  ^4) 


where S and A (with the corresponding subscripts) are amplitudes, the bracketed quantities ⟨φ⟩
are normalization constants, and φ(x,y) are the transverse distributions of the mode fields.


We  consider  a  device  in  terms  of  diffused  channel  waveguides.  We  estimate 

propagation  constants:  Pi=(hi  ko“-eo)AE  ^  ^  which  are  shown  in  Table  1.  For  weakly  guiding 
waveguides,  we  have 

hi  - hj  -l3j)^£{2iJs'^)  (5) 

Using  Table  1  and  Eq.  5,  we  come  to  and  dependence  presented  in  Fig. 2. 
Table 1  Propagation constants.


Pe“Po 

Pee"Poe 

Pee“Peo 

Peo-Poc 

0.01  70 

0.1  500 

0.0046 

0.1  842 

0.8540 

0.8890 

The  condition  rol=rei  =0,983  holds  at  F=2,8664.  In  the  case  =-0,162  and  PPo^=0,935  (- 

0,29dB).  Sufficient  conditions  of  zero  cross-talk  are'^oi”*^®!’  ro3=re3,  exp[i(A<t>i-A(|)3)]=l 
addition  x  =0  for  the  "cross"  state  and  for  the  "bar"  state  x  is  found  from  an  equation 
ti  1  +exp(-i  Ahl)Tj  1  A=0  _ 


voltage,  V 

Fig.  2  Optical  output  power  of  the  directional  coupler  vs.  input  voltage. 

1  -novel  type  of  directional  coupler;  2  -traditional  type  of  directional  coupler. 


2.2 Experiment

In our experiments, titanium is deposited in a high vacuum plant through vapour
deposition to a thickness of 40 nm. The Ti indiffusion is carried out at 1040 °C for 4 h. MgO is
deposited nonreactively through vapour deposition at a rate of 20 nm/min to avoid
dissociation. The strip thickness is 30 nm; indiffusion is carried out for 2 h at 910 °C with a
flow of dry synthetic air at a rate of 2.5 l/min.

The fabrication parameters for the two-mode waveguides were: titanium strip width
W = 4 µm and coupler length L = 8 mm. In the initial experiments the two-moded intersection
region was realized by increasing the width of the center titanium stripe to 2W and keeping the
titanium thickness constant. On the center titanium stripe a MgO stripe (2 µm wide and
30 nm thick) was deposited and the second indiffusion was made.

The directional coupler was fabricated using quartz buffer layers and Cr-Au electrodes.
The polished waveguide edge faces were butt-coupled to polarization-preserving fibers using
index-matching epoxy. The far-end cross-talk attenuation, or on/off ratio, varies between 19 and
29 dB. For the device with 8 mm coupling length and 2.5 µm gap, more than 80% efficiency was
obtained (at a wavelength of 1.3 µm and a switching voltage of 12 V). Optical output powers are
plotted for TE and TM modes, with a waveguide gap of 2.5 µm, in Fig. 4.

3. Optical multistage interconnection network architecture

A popular method for realizing the switching or commutator fabric is the Batcher or
Batcher-banyan network [10]. This is a multistage network in which each stage performs a fixed
permutation on the incoming lines and then routes them through a column of 2x2 switching
elements. According to the standard bit-controlled routing algorithm for these networks,
switches in the first stage are controlled by the most significant bit of the destination tag, those
in the second by the next bit, and so on. By convention, the control bit of the top input decides
the state of the switching element. If it is zero, the switch is in the straight connection state;
otherwise it is crossed. These switches route the input with the lower tag to the top output and
the one with the higher tag to the bottom output.
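
The bit-controlled routing rule can be sketched in a few lines of Python. The example below assumes an omega-type (shuffle-exchange) banyan topology, which is one common realization and is not necessarily the layout used in this paper; the function name `route` is illustrative. Each stage applies a fixed perfect-shuffle permutation and then a 2x2 element selects its upper or lower output from one bit of the destination tag, most significant bit first.

```python
def route(source, dest, n_bits):
    """Trace one signal through an n_bits-stage omega network (assumed topology).

    Returns the final line number and the state of each 2x2 element used.
    """
    size = 1 << n_bits
    pos = source
    states = []
    for stage in range(n_bits):
        # Fixed permutation between stages: perfect shuffle (rotate left).
        pos = ((pos << 1) | (pos >> (n_bits - 1))) & (size - 1)
        # Control bit for this stage, most significant bit first.
        bit = (dest >> (n_bits - 1 - stage)) & 1
        # Entering on port (pos & 1) and leaving on port `bit`.
        states.append("straight" if (pos & 1) == bit else "cross")
        pos = (pos & ~1) | bit
    return pos, states

if __name__ == "__main__":
    final, states = route(source=5, dest=2, n_bits=3)
    print(final, states)
```

For an 8x8 network (three stages) every source reaches the requested destination line, although two simultaneous connections may still contend for the same element, which is why a sorting (Batcher) stage is usually placed in front of a banyan.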

A strictly nonblocking crossbar switch can always establish a connection between an input
and output without prejudice to connections already made. This architecture requires n²
switches. In a guided-wave device with a large aspect ratio there are 2n−1 switch cell lengths
from end to end of the switch. Such an architecture can have fewer switches and be shorter
than the equivalent strictly nonblocking switch. We have designed a switch based on the
23-switch 8x8 design shown in Fig. 3.




Fig. 3. Nonblocking 8x8 integrated-optical switch matrix design.
Abstract. We present in this paper an experimental and numerical analysis of photorefractive
waveguides fabricated in a lithium niobate crystal. A parametric model of the 3-D refractive
index distribution of the photorefractive waveguide is used. The maximum refractive index
change is estimated at Δn_max = 1.2×10⁻³ by comparing the near-field patterns on the rear face
of the crystal calculated by the beam propagation method with the experimental observations.

1.  Introduction 

We proposed a novel approach to three-dimensional optical interconnections for optical
neural networks [1]. Photorefractive waveguides are fabricated by simply focusing and
scanning a laser beam in the photorefractive material as shown in Fig. 1. Figure 2 shows our
concept for implementing high-density optical interconnections for neural networks. The
resulting waveguides with variable index profiles may be used for dynamic optical
interconnections. We believe that high-density optical interconnections with a
self-organization ability can be realized by this approach. In the preliminary experiment, the
photorefractive waveguides were successfully fabricated by scanning a focused laser beam in
a lithium niobate (LN) crystal, and the guided light on the rear face of the crystal was clearly
observed. We have proposed a two-dimensional model of the refractive index distribution of
the photorefractive waveguide [2]. Accurate knowledge of the three-dimensional refractive
index distribution of the photorefractive waveguide is necessary to evaluate the optical
characteristics of the waveguide. However, it is difficult to measure the complicated structure
of the three-dimensional refractive index distribution inside the crystal. In this paper, we
propose a three-dimensional model for the refractive index change. The maximum refractive
index change is determined by comparison between experimental and numerical results for the
near-field patterns of curved waveguides.


Fig. 1 Fabrication of 3-D photorefractive waveguide (scanning a photorefractive material).

Fig. 2 High-density optical interconnection using photorefractive waveguide.
2. Model of 3-D refractive index distribution of photorefractive waveguide

We propose a parametric model of the three-dimensional refractive index distribution of
the photorefractive waveguide that is based on the measured refractive-index distribution [1].
Figure 3(a) shows the transverse refractive index distribution in the plane perpendicular to the
propagating direction of a straight waveguide. The field of view of this figure covers a
rectangular area of approximately 70 µm x 70 µm. In Fig. 3(a), the transverse axis is parallel
to the c axis and the longitudinal axis is parallel to the b axis of the LN crystal. Bright areas
indicate those of the higher refractive index and the dark areas those of the lower index. A pair
of waveguides is formed at both sides of the exposed region along the c axis.

Let the x-axis be parallel to the c axis and the y-axis be parallel to the b axis of the LN
crystal, and let the z-axis coincide with the optical axis of the experimental optical system. The
proposed model for the refractive index distribution of a photorefractive waveguide in the LN
crystal is

$\Delta n(x,y,z) = \alpha \sum_{i} a_i \exp\left[ -\frac{(x + c_i + f(z))^2}{b_i^2} - \frac{(y + d_i + g(z))^2}{e_i^2} \right]$,  (1)

where α is a variable parameter that depends on the power of the fabrication beam; a_i, b_i,
c_i, d_i, and e_i are determined by fitting Δn(x,y,z) to the experimental data; and f(z) and g(z)
denote the deviation of the focused spot with regard to the x-axis and y-axis, respectively.
Figure 3(b) shows the model refractive index distribution fitted to the experimental result
shown in Fig. 3(a). By changing f(z) and g(z), we can design varieties of 3-D waveguides.


Fig. 3 Refractive index distributions in the plane perpendicular to the propagating
direction of a straight waveguide; (a) the experimental result, and (b) the model fitted to
the experimental result are displayed.

3.  Experimental  and  numerical  results 

The experimental setup is shown in Fig. 4. The LN crystal nominally contains less than
10 ppm of impurities (Fe ions: less than several ppm). The thickness of the crystal is
approximately 2.0 mm. A linearly polarized argon ion laser beam (λ = 514.5 nm) is used to
fabricate the waveguide and to excite the guided light. When we fabricate the photorefractive
waveguide, the electric field vector of the argon ion laser is set perpendicular to the c axis
(ordinary ray). When the light is to be guided, a half-wave plate is inserted to excite the
extraordinary ray and the optical beam power is reduced to approximately 1/100 of that of the
fabrication beam. The argon ion laser beam is focused by a microscope objective lens L1. The
numerical aperture of the focused beam is approximately 0.09. To fabricate the 3-D
waveguide, the longitudinal position of the focus and the lateral position of the crystal are
controlled by the translators. The microscope objective lens L2 is placed behind the crystal for
observation of the near-field pattern on the rear face of the crystal. The near-field pattern is
recorded by the CCD image sensor.


Fig. 4 Optical system for fabricating and testing the photorefractive waveguides; P denotes
polarizer, S shutter, HWP half-wave plate, TR translator, PC personal computer, and L's
microscope objective lenses, respectively.

Fig. 5 Near-field pattern of the curved waveguide.
Fig. 6 Numerical results of near-field pattern in the case of (a) Δn_max = 1.2×10⁻³ and
(b) Δn_max = 0.9×10⁻³, and (c) a simultaneous plot of these intensity distributions taken along
the central line parallel to the c axis. The solid and broken curves indicate patterns taken with
Δn_max = 1.2×10⁻³ and Δn_max = 0.9×10⁻³, respectively.


We compared the results of a numerical analysis of near-field patterns of the curved
waveguides with the experimental results. In a curved waveguide with an insufficient
refractive index change, a part of the guided light may be radiated near the curved portion.
Thus, an accurate index distribution of the waveguide can be obtained by this comparison. We
fabricated an s-shaped waveguide whose transverse displacement, |f(z_in) − f(z_out)|, between
the input (z_in) and output (z_out) ends of the waveguide is 30 µm [g(z_in) = g(z_out)]. We used
the waveguide that is located at the negative side along the x-axis. The experimental result for
the near-field pattern of the curved waveguide is shown in Fig. 5. The field of view of this
figure covers a rectangular area of approximately 50 µm x 85 µm. The radiation loss along the
x-axis is almost negligible. In the numerical simulation, various near-field patterns are
calculated using the beam propagation method [3] on the assumption that the incident beam is
a Gaussian beam. Figure 6 shows the numerical results for near-field patterns of the curved
waveguide in the cases of Δn_max = 1.2×10⁻³ and 0.9×10⁻³, respectively. The intensity
distribution of the guided beam in the curved waveguide on the x-z plane is shown in Fig. 7.
In the case of the lower value of Δn_max, a part of the guided light is radiated along the x-axis
at the curved portion of the waveguide. The result for Δn_max = 1.2×10⁻³ shows agreement
with the experimental result except that the width of the intensity profile along the x-axis is
narrower than in the experiment. The maximum refractive index change, Δn_max, is estimated
to be equal to or larger than 1.2×10⁻³.

4.  Conclusion 

We proposed a model of the three-dimensional refractive index distribution of a
photorefractive waveguide and estimated the maximum refractive index change by comparing
the experimental results with numerical analysis. The comparison showed that Δn_max is equal
to or larger than 1.2×10⁻³.
Abstract. A method of enhancing the information capacity
and the reliability of information storage in a circulator
fiber-optic memory is presented. The method is
based on creation of an additional built-in channel
with contrary-directed circulation of signals. This
channel can be used for transmission of both information
and auxiliary signals: address words, clock
signals, correcting sequences, etc. The possibility of
compensating the information signal losses by means of
stimulated Raman scattering is considered.


1.  Introduction 

The information capacity of a fiber-optic memory
device (FOMD) is normally increased either by increasing the
length of the fiber circulation loop, at the cost of added complexity
of the synchronization circuit, or by using conventional methods
of channel multiplication typical of fiber-optic communication
links, i.e. spectral, polarization and temporal methods
[1,2]. However, the closed nature of circulator memory systems also
makes it possible to create additional information channels
that can be employed both for reliable addressing,
synchronization and noise-immune encoding and for increasing the
information capacity of the FOMD [3,4]. In this case, the use
of traditional multiplication and synchronization techniques
is not excluded.

This paper is concerned with the proposed method of
increasing the information capacity and reliability of information
storage in a circulator fiber-optic memory by creating
an additional built-in channel with contrary-directed signal
circulation.


2.  Organization  of  the  contrary-directed  channel 

In accordance with the proposed method, two contrary-directed
circulation channels, a main and an additional one, are organized
in a single-fiber FOMD loop (a generalized structural diagram of
the FOMD is shown in Fig. 1). The information signal sequence
circulates in the main channel, while the additional channel
can be used to transmit both part of the information sequence
and auxiliary signals (verifying, synchronizing and addressing
words) formed by the auxiliary signal unit (ASU). In the latter
case, optical signals circulating in the two channels can be
transmitted at either the same or different carrier frequencies.

Fig. 1. Structural diagram of the fiber-optic memory loop with
contrary-directed circulation of the signals. Here, PD is the
photodetector, M is the modulator, ASU is the unit for creation of
auxiliary signals, C is the commutator, CIU is the unit for correction
of information, L is the laser, IOU is the information input/output unit,
FC is the fiber-optic coupler, CG is the clock pulse generator,
AF is the address former, DC is the digital comparator.

In the case shown in Fig. 1a, the additional channel serves
to transmit a verifying correcting sequence and ensures
enhanced reliability of storage of the main information
sequence. The input information sequence {a} is subjected to
convolutional symmetrical coding and is converted to the
corresponding sequences of information {b} and test {c}
optical signals. In this case, the commutator is switched to
position 1 (see Fig. 1a), and the ASU acts as a convolutional
encoder consisting of two series-connected phase modulators
whose control voltage is set equal to zero or to the half-wave value
depending on the value of the information signal.

The separated information and test signal flows arrive
simultaneously in the fiber-optic circulation loop through
couplers placed at its opposite ends. The contrary-directed
flows {b} and {c} also emerge simultaneously at the loop
outputs and, being converted to the corresponding sequences
of electrical signals {b} and {c}, are used for decoding
information and correcting errors without any demultiplexing
required. The corrected information sequence is again subjected
to encoding.

The use of the contrary-directed channel in the fiber-optic
circulator makes it possible to increase the error-free
storage probability owing to a two-fold reduction of the
character repetition rate in the optical path, or to achieve
a two-fold increase in the stored information volume at the
same clock frequency.

To ensure reliable access to any segment of the circulating
information sequence and to improve its storage security, the
additional channel can be used as an addressing channel. In
this case (Fig. 1b), writing of the information sequence and
the auxiliary (addressing) sequence into the loop memory is
achieved by feeding them synchronously to the information (inf.
input) and auxiliary (aux. input) input terminals of the
input/output unit (IOU), respectively, and further to the control input
terminals of the corresponding electrooptical modulators. As
the addressing sequence circulates in the closed loop, its
segments (address words) are also synchronously written into
an n-bit address former (AF), thereby forming the address of the
incoming information sequence. Synchronous circulation of signal
flows along identical channels allows simultaneous feeding of
the IOU input terminals independent of any external influences.

Access to the required information signal (rewriting,
retrieval) is realized by setting the appropriate address on
the address inputs of a digital comparator (DC). When the
contents of the AF coincide with the set address during the
next passage of the address sequence along the circulation
loop, the DC generates a control signal used to strobe
information signals.

In order to reduce the error probability in forming address
words, it is advantageous to use a pseudo-random sequence (PRS)
as the address sequence. The PRS length is M = 2^N − 1, where N is
the PRS generator bit capacity, and is determined by the memory
information capacity; the address former is an n-bit
shift register, where n is not less than N. If a redundant
length of the PRS address words is used, i.e. for n > N,
addressing is free of errors even if the received address word has
several incorrect ("spoilt") bits.
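
The property that makes a PRS suitable for addressing can be checked with a short Python sketch. It assumes the PRS is a maximal-length sequence produced by a linear feedback shift register; the 4-bit generator with recurrence a[k+4] = a[k] XOR a[k+1] is an illustrative choice, not the generator of the paper. Over one period of length 2^N − 1, every N-bit window is distinct, so each window identifies one position in the loop.

```python
def prs(n_bits=4, taps=(0, 1), seed=1):
    """One period of an LFSR sequence (illustrative 4-bit maximal-length generator)."""
    state = seed
    period = (1 << n_bits) - 1
    out = []
    for _ in range(period):
        out.append(state & 1)            # output the oldest bit
        feedback = 0
        for t in taps:
            feedback ^= (state >> t) & 1
        state = (state >> 1) | (feedback << (n_bits - 1))
    return out

def address_words(seq, n):
    """All cyclic n-bit windows of the circulating sequence, as seen by the AF."""
    m = len(seq)
    return [tuple(seq[(i + j) % m] for j in range(n)) for i in range(m)]

if __name__ == "__main__":
    seq = prs()
    words = address_words(seq, 4)
    print(seq)
    print(len(seq), len(set(words)))
```

Using n > N bits per address word adds redundancy to each window, which is what allows a word with a few corrupted bits to be matched to the correct address.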

The additional channel can also be used for transmission
of a sequence of clock signals whose time parameters are
matched with those of the information signal. In this case, the
information and synchronizing sequences circulate in synchronism
in opposite directions along the same loop, which excludes
their mistiming caused by the influence of external temperature
and mechanical factors on the fiber-optic path, thus improving
the reliability of the information signal synchronization. In this
case, the ASU is a modulator that is identical to the
main channel modulator and is controlled by the clock pulse generator
(the commutator in Fig. 1a is switched to position 2). The clock
signals at the output end of the additional channel are used
for synchronous detection of the output information sequence.
The repeated input of the restored information signals into
the circulation loop is performed synchronously with pulses of
a quartz clock.

The contrary-directed channel in the fiber-optic circulator
can be employed for amplification and compensation of
attenuated information signals propagating in the light guide,
making an electrooptical regenerator superfluous. Moreover, by
closing the loop with the couplers, an all-optical FOMD with
multiple propagation of signals in the loop is obtained. Even
if the electrooptical regenerator is employed, optical
amplification allows a considerably longer loop to be formed
and, hence, a higher FOMD capacity to be obtained.

Optical amplification in a fiber can be accomplished by
using various nonlinear optical effects, for example,
stimulated Raman scattering (SRS) [5].


3.  Conclusions 

The proposed bi-directional circulation regime makes it
possible to find an effective solution to the problems of
increasing the information capacity and attaining reliable
storage of information in the FOMD circulation loop. Moreover,
it allows optical amplification of information signals to be
achieved. Amplification of information signals can be
effected by using both continuous-wave pumping and
synchronizing signals of appropriate power. Thus, the
contrary-directed channel can be used simultaneously for both
synchronization and amplification of information pulses. The
required optical power of the pump wave is of the order of
some hundreds of milliwatts [4].

This  work  has  been  financially  supported  from  the 
Fundamental  Research  Foundation  of  the  Republic  of  Belarus. 
Abstract.  A scheme for performing very fast all-optical
arithmetic operations in a nonlinear planar waveguide is
presented. The scheme is based on the interaction properties of solitons.


1.  Scheme  of  the  device 


Our scheme is based on a pair of spatial solitons propagating in a nonlinear
waveguide, under the hypothesis of transverse confinement realized by a third-order
nonlinear medium, where the relative phase is properly varied so that the
interaction properties of solitons can be utilized to perform the required process.

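Our analysis starts from a dimensionless nonlinear Schroedinger equation:

    $i\,\frac{\partial u}{\partial z} + \frac{1}{2}\frac{\partial^2 u}{\partial x^2} + |u|^2 u = 0$    (1)

where u is the normalized field envelope, z the propagation coordinate and x the
transverse coordinate.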

Under proper conditions the propagation is described by the following system of
coupled differential equations:

    $\frac{d^2 q}{dz^2} = -4\exp(-2q)\cos(2\Phi)$

    $\frac{d^2 \Phi}{dz^2} = 4\exp(-2q)\sin(2\Phi)$    (2)


where q = q(z) is the relative distance between the solitons at the generic
coordinate z along the propagation direction, and $\Phi$ is the relative phase of the solitons.
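
The dependence of the interaction on the relative phase can be illustrated numerically. The following is a minimal sketch, assuming system (2) has the form $d^2q/dz^2 = -4\exp(-2q)\cos(2\Phi)$, $d^2\Phi/dz^2 = 4\exp(-2q)\sin(2\Phi)$; the initial separation, the propagation length and the function names are hypothetical choices for illustration only. It integrates the system with a classical Runge-Kutta scheme for two initial phases and prints the final separations.

```python
import math


def rhs(state):
    """Right-hand side of system (2) written as four first-order ODEs."""
    q, dq, phi, dphi = state
    coupling = 4.0 * math.exp(-2.0 * q)
    return (dq, -coupling * math.cos(2.0 * phi),
            dphi, coupling * math.sin(2.0 * phi))


def integrate(q0, phi0, z_end=5.0, steps=5000):
    """Fourth-order Runge-Kutta from z = 0 to z_end, starting at rest."""
    h = z_end / steps
    s = (q0, 0.0, phi0, 0.0)
    for _ in range(steps):
        k1 = rhs(s)
        k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
        k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
        k4 = rhs(tuple(a + h * b for a, b in zip(s, k3)))
        s = tuple(a + h / 6.0 * (b + 2 * c + 2 * d + e)
                  for a, b, c, d, e in zip(s, k1, k2, k3, k4))
    return s


# Phi = 0: cos(2*Phi) = 1, the separation shrinks (attraction).
q_phase_zero = integrate(q0=3.0, phi0=0.0)[0]
# Phi = pi/2: cos(2*Phi) = -1, the separation grows (repulsion).
q_phase_half_pi = integrate(q0=3.0, phi0=math.pi / 2)[0]
print(q_phase_zero, q_phase_half_pi)
```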

In Ref. [3] the influence of a change in the relative phase of two input solitons
of equal input intensity on the intensity of the soliton whose phase has not been
changed is studied. The intensity is found to vary according to the law:

    $I_0 = k_1 \Phi$    (3)

where $k_1$ is a constant depending on the initial distance from the other soliton.
This means that the phase information carried by a pulse can be transferred to the
intensity of another pulse after propagation (intensity modulation by change in phase).
With the help of this property all arithmetic operations can be realized. For example, the
adder can be realized when two soliton beams enter the same waveguide with a
relative phase equal to zero and their interaction in the first part of propagation is
avoided by means of a proper shield. A beam of intensity $I_1$, propagating normally to
the soliton path, changes via a nonlinear interaction the refractive index of a length L
of the soliton path (see Fig. 1), inducing a phase change equal to $k_2 L I_1$. A second beam
of intensity $I_2$ interacts with the same soliton for the same length L, inducing a further
phase change equal to $k_2 L I_2$. The total phase change is $k_2 L (I_1 + I_2)$. Now we let the
two solitons interact: they interfere according to the intensity modulation effect, and
the intensity of one soliton becomes $I_0 = k_1 k_2 L (I_1 + I_2)$, which is proportional to the sum
of the intensities of the input beams, realizing an optical addition.

In a similar way it is possible to realize an optical subtraction. If we consider the
same structure as the adder, but where the zone illuminated by the second beam $I_2$
consists of a medium with a negative nonlinear refractive index coefficient, the phase
change induced on the soliton beam is equal to $-k_2 L I_2$. The intensity of the soliton after
interaction becomes: $I_0 = k_1 k_2 L (I_1 - I_2)$.

The division is made in a quite similar way by using only a beam of intensity $I_1$. If we
reduce the length L by a factor N, the phase change induced on the soliton is equal
to $k_2 L I_1 / N$, and the output intensity becomes $I_0 = k_1 k_2 I_1 L / N$. This process is quite
critical since it depends on our capability of reducing the length L properly. See Fig. 2.
The multiplication is made by increasing the length L by N times, so that the output
intensity becomes $I_0 = k_1 k_2 L N I_1$.
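
The four operations reduce to the same linear chain: control intensity to phase change (factor $k_2 L$) and phase change to output intensity (factor $k_1$). The sketch below only restates this arithmetic under the assumption that the law (3) stays linear; the numerical values of `K1`, `K2` and `LENGTH` are hypothetical and chosen for easy checking, not device parameters.

```python
K1 = 2.0      # hypothetical phase-to-intensity constant k_1
K2 = 0.5      # hypothetical intensity-to-phase constant k_2
LENGTH = 1.0  # hypothetical interaction length L (arbitrary units)


def soliton_output(phase_change):
    """Eq. (3): output intensity proportional to the induced phase change."""
    return K1 * phase_change


def add(i1, i2):
    """Two control zones of length L with the same sign of nonlinearity."""
    return soliton_output(K2 * LENGTH * i1 + K2 * LENGTH * i2)


def subtract(i1, i2):
    """Second zone has a negative nonlinear coefficient: its phase change is negative."""
    return soliton_output(K2 * LENGTH * i1 - K2 * LENGTH * i2)


def divide(i1, n):
    """Interaction length reduced by a factor n."""
    return soliton_output(K2 * (LENGTH / n) * i1)


def multiply(i1, n):
    """Interaction length increased by a factor n."""
    return soliton_output(K2 * (LENGTH * n) * i1)


print(add(3.0, 4.0), subtract(5.0, 2.0), divide(8.0, 4), multiply(3.0, 4))
```

With $k_1 k_2 L = 1$, as chosen here, the output intensity equals the arithmetic result directly; otherwise it is scaled by that product.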


2.  Dimensioning  of  the  device 

To dimension these devices it is necessary to match different parameters. The medium
ought to have a large nonlinear refractive index $n_2$, so that low power levels can be
used, and a fast response time, so that the device is quick.

Once the Kerr medium is chosen, the linear and nonlinear refractive indices are
determined, as is the wavelength $\lambda_0$ necessary to excite the nonlinearity.

The  structure  of  the  waveguide  determines  the  spot  size  necessary  to  excite  the 
minimum  number  of  modes.  The  ideal  situation  is  the  excitation  of  only  the 
fundamental  mode. 

The power necessary to excite a soliton beam is thus:

    $P_s = \frac{\lambda_0^2}{4 a^2 n_0 n_2}$.    (4)

Since the material is characterized by a well-defined response time of the nonlinearity,
the duration of the pulses emitted by the light source must be greater than or approximately
equal to this response time.

The intensity modulation effect depends strongly on the initial soliton beam spot size: for
the effect to take place it is necessary to make the two solitons propagate for a distance
approximately equal to ten times the beam width; we use a safety factor equal to ten.

The time $T_m$ of the intensity modulation effect is immediately given as:

    $T_m = \frac{100\,a\,n_0}{c}$.    (5)

The phase-varying soliton that crosses the zone illuminated by the input beams
experiences a phase variation with respect to the other soliton equal to:

    $\Delta\varphi = \Delta n \cdot k \cdot L$,    (6)

where $\Delta n$ is the induced refractive index variation, k is the wavevector in the
medium and L the computation length.

Since the intensity modulation by phase change is linear only for relative phase
values in the range $\pi/2$ to $\pi$, that is for a maximum phase variation of $\pi/2$
(provided one wishes to use the whole available phase range), we can calculate the
computation length L from Eq.[6] by imposing $\Delta\varphi = \pi/2$, which gives:

    $L = \frac{\pi/2}{k\,\Delta n} = \frac{a^2}{\lambda_0}$.    (7)

The time necessary to pass through L, called the computation time, is:

    $T_p = \frac{n_0 L}{c} = \frac{n_0 a^2}{\lambda_0 c}$.    (8)

The time expressed by Eq.[8] is the minimum time reachable for a phase change of $\pi/2$,
since it is calculated for a controller beam power equal to $P_s$. A higher power of the
controller beam would generate a transverse soliton that collides with the other solitons,
deflecting them from their intended trajectories and causing the device to malfunction. For
this reason it is necessary to decrease the power level. If it is reduced N times, the
computation time becomes N times greater.

Generally it is sufficient to reduce the power by a factor of two, which does not appreciably
slow down the device.
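
The scaling of the computation time with control power can be checked with a short sketch. It assumes the phase law of Eq.[6], a wavevector $k = 2\pi n_0/\lambda_0$ in the medium, propagation at speed $c/n_0$, and an index change proportional to the control power (Kerr response). The numerical inputs are hypothetical illustration values, not the parameters of the example in the next section.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def computation_length(delta_n, wavelength, n0, target_phase=math.pi / 2):
    """Length L over which an index change delta_n accumulates target_phase."""
    k = 2.0 * math.pi * n0 / wavelength  # wavevector in the medium
    return target_phase / (k * delta_n)


def computation_time(delta_n, wavelength, n0):
    """Transit time through L, assuming the beam travels at c / n0."""
    return n0 * computation_length(delta_n, wavelength, n0) / C


# Hypothetical illustration values (not taken from the paper).
n0, wavelength, delta_n = 1.5, 1.0e-6, 1.0e-4
t_full = computation_time(delta_n, wavelength, n0)
# Halving the control power halves delta_n for a Kerr medium.
t_half = computation_time(delta_n / 2.0, wavelength, n0)
print(t_full, t_half)
```

Under these assumptions, reducing the control power by a factor N lengthens L, and hence the computation time, by the same factor.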


3.  A  practical  example. 

Let us consider a practical example: a waveguide composed of a thin film of
$CS_2$, an organic compound whose nonlinear mechanism is the
molecular-orientation Kerr effect, with a response time of about 1 ps. The optical
parameters are:


    $n_0 = 1.6$,    (9)

Since the material is liquid, it is confined by means of two parallel quartz plates. The
thickness of the waveguide is about 10 μm and we have 10 propagating modes. It is
possible to excite the fundamental mode if a narrow converging beam is used. If we
use a spot size of 100 μm with $\lambda_0$ = 532 nm, which is the wavelength at which the
material has a low absorption coefficient, we can calculate from Eq.[4] the critical
power necessary to obtain a soliton beam:

    $P_s = 1\cdot10^{[\text{illegible}]}\ \mathrm{W/m^2}$    (10)

To have this power level it is necessary to use a pulsed source. The ideal candidate at
$\lambda_0$ = 532 nm is a Nd:YAG laser. Since the typical pulse durations of this laser are in the
range 10 to 100 ps, we are sure that the medium is fast enough to support and sustain a
soliton beam.

The computation time is given by Eq.[8]:

    $T_p = 10$ ps.    (11)

The time necessary to perform a single addition, where $T_m$ is the time of the intensity
modulation effect of Eq.[5], is:

    $T_0 = T_m + 2T_p = 80$ ps    (12)

The repetition rate of the operations obviously depends on the source and on how
the heat generated by absorption in the medium is dissipated.


4.  Conclusions

We  presented  a  flexible  scheme  to  realize  all-optical  arithmetic  operations  that,  given 
two  input  beams,  is  able  to  generate  an  output  beam  whose  intensity  is  proportional  to 
the  sum  (or  to  the  difference  if  properly  designed)  of  the  input  intensities.  The  scheme, 
properly  modified,  is  able  to  realize  the  division  or  the  multiplication  of  the  intensity  of 
an  input  beam. 
Fig.  1  Scheme  of  the  optical  adder.


Fig.  2  Scheme  of  the  optical  divider. 
Abstract.  Basic properties and possible applications
to optical computing of new types of spatial and
spatio-temporal solitons and their bound structures
are considered in the following nonlinear optical
schemes: (a) medium with nonlinearity of refractive
index; (b) wide-aperture laser with saturable
absorption; (c) optical fibre with sections of gain
and absorbing media; (d) wide-aperture nonlinear
interferometer.


1.  Introduction

Optical solitons in nonlinear optical fibres have long been
attracting the attention of investigators in connection with
their applications to optical communications, superfast
switching and optical computing [1]. At present, the number
of optical soliton-like structures has considerably increased.
Thus, possibilities of applications of spatial solitons
generated in integrated optical (planar) circuits are
discussed [2]. The opportunities to use three-dimensional
spatial solitons - "light bullets" - for information
processing and computing are attractive [3]. A number of schemes
for all-optical information processing have been proposed
that use switching waves and diffractive autosolitons
(spatio-temporal solitons) in a wide-aperture nonlinear
interferometer [4-6].

In the present work, we compare general properties of
bright soliton-like light structures in a number of nonlinear
optical systems, mainly considering the stability of their
characteristics. We come to the conclusion that these
structures are most stable in bistable interferometers. In
this connection, we analyze more thoroughly the all-optical
scheme of a full adder proposed in [5], which essentially uses
the properties of diffractive autosolitons. A number of stages
of a multidigit summing cycle have been modelled in [7].
2.  Solitons in nonlinear transparent media

Here we discuss 2-D spatial solitons in a medium with
saturable nonlinearity of refractive index. Computer simulations
of the interaction of 2-D solitons were done in [8]. Colliding
nonfundamental solitons decay into fundamental ones. In
collisions, fundamental solitons pass through each other without
noticeable changes if the interaction region is sufficiently
small. In the mode of strong interaction, repulsion of
solitons with partial power transfer or their attraction and
convergence are observed. Because the propagation constant, or
the maximal radiation intensity, of fundamental solitons has a
continuous spectrum, noise causes a drift of the soliton parameters.

3.  Solitons in wide-aperture lasers and active fibers

For lasers with saturable absorption, hard excitation of
lasing is typical. In these conditions, generation of
switching waves and laser solitons - islands of lasing over
the nonlasing background - is possible for wide-aperture
lasers [9]. The maximal radiation intensity for a laser
soliton has a certain value, for which radiation losses are
compensated by saturable gain.

In transversely 2-D geometry, there are stable laser
solitons with different topological charges m = 0 (regular
wave front), and m = ±1, ±2, ... (vortices). In a laser with
infinite aperture, there are laser solitons propagating across
the aperture with arbitrary velocity. This corresponds to an
arbitrary angle between the axis of radiation propagation in
the soliton and the cavity axis. In an actual laser with
finite aperture a soliton is reflected by the mirror edge if
the angle between the axes is smaller than some critical
value. The result of a collision of transversely 1-D solitons
depends on their relative velocity and the phase difference.
As the relative velocity decreases, the following regimes
take place consecutively: passing of the solitons through each
other; generation of new solitons; convergence of the solitons
into one soliton; their repulsion.

The equation for propagation of pulses in a one-mode
nonlinear fibre with intervals of saturable gain and absorption
coincides with the equations for a transversely 1-D laser with
saturable absorption. Therefore, the results presented above are
valid for this case also. The hard type of lasing excitation
suppresses radiation noise, and the fixed value of the maximal
intensity excludes the drift of the soliton energy. This
makes laser solitons promising for optical communications.


4.  Diffractive autosolitons in nonlinear interferometers

In a wide-aperture interferometer excited by a wide beam of
cw external radiation, there is a discrete spectrum of
spatio-temporal solitons - "diffractive autosolitons" (see
review [6]). The maximal intensity and width of an
autosoliton have definite values. Bound states of diffractive
autosolitons have a discrete spectrum of distances between them.
Asymmetric structures move in the transverse direction with
constant velocity determined by the intensity of the holding
radiation. The following section considers a possible use of
the properties of diffractive autosolitons.




5.  Computer simulations of optical full adder

One of the stages of the all-optical full-adder operation
proposed in [5,6] is doubling of the spatial modulation period of
the holding radiation. In the digits with unities in both
summands, it provides generation of an asymmetric coupled
structure of two autosolitons with different widths from the
autosolitons stored in independent cells. If only one of the
summands has unity in the considered digit, the change of
holding radiation modulation produces the necessary shift of the
autosoliton representing the unity to the center of the digit
for its future processing.

Fig. 1 presents the results of computer simulations for
the case when both summands have unities in the corresponding
digit. The unities are stored in the form of a narrow and a
wide autosoliton located in the maxima of the holding radiation
intensity (Fig. 1a). The holding intensity is described by

    $I_i = I_{i0} + I_{mod}\cos[k(x - x_0)]$,    (1)

where ($\lambda$ is the radiation wavelength and $I_{sat}$ is the
saturation intensity)

    $I_{i0} = 0.95\,I_{sat}$,  $I_{mod} = 0.02\,I_{sat}$,  $k = 3.35\cdot10^{-3}\,\lambda^{-1}$,  $x_0 = 4200\lambda$.    (2)

In the considered system, autosolitons move along the
intensity gradient in the direction of higher intensities.
The change of the parameters of holding radiation modulation to

    $k = 1.675\cdot10^{-3}\,\lambda^{-1}$,  $x_0 = 5100\lambda$,    (3)

places the autosolitons on the slopes of the intensity maximum
and makes them move towards each other. Fig. 1b shows the
settled intensity distribution 100 round-trip times after
modulation switching. This spatial structure of light becomes
almost stationary in about 40 round-trip times.
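
The direction of motion after the switching can be checked from the sign of the holding-intensity gradient. The sketch below assumes a cosine modulation $I_{mod}\cos[k(x - x_0)]$ with k expressed in units of $1/\lambda$ and an exponent of $10^{-3}$ in both parameter sets, and it assumes that the two autosolitons initially sit at adjacent maxima of the first modulation. These are readings of a partly illegible source, so the numbers are illustrative.

```python
import math


def holding_gradient(x, k, x0, i_mod=0.02):
    """d/dx of I_i0 + I_mod*cos(k*(x - x0)); x in wavelengths, I in I_sat."""
    return -i_mod * k * math.sin(k * (x - x0))


# Modulation parameters before and after switching (lengths in wavelengths).
k_old, x0_old = 3.35e-3, 4200.0
k_new, x0_new = 1.675e-3, 5100.0

# Assumption: the autosolitons start at two adjacent maxima of the old modulation.
left = x0_old
right = x0_old + 2.0 * math.pi / k_old

# Autosolitons drift towards higher intensity, i.e. along the gradient sign.
print(holding_gradient(left, k_new, x0_new))   # positive: drifts towards +x
print(holding_gradient(right, k_new, x0_new))  # negative: drifts towards -x
```

Under these assumptions the new maximum at 5100 wavelengths lies between the two stored autosolitons, so both slide towards it from opposite sides.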

Fig. 2 illustrates the shift of an autosoliton after the
change of holding radiation modulation. Initially, the
autosoliton is stored in the intensity maximum (Fig. 2a), the
modulation being given by (1) with parameters (2). After
changing the parameters of the modulation to (3), the
autosoliton moves to the new intensity maximum, reaching it in
about 150 round-trip times. Fig. 2b shows the intensity
distribution 400 round-trip times after the switching.

Thus,  computer  simulation  confirmed  the  possibility  of 
realization  of  all  stages  necessary  for  optical  adding  of 
multidigit  numbers  within  the  method  proposed  earlier. 
Abstract

The  asymmetric  dragging  interaction  between  three-dimensional  optical  solitons  may  allow 
cascadable,  phase-insensitive,  NOR  gates  with  gain  to  be  implemented  at  ultrahigh  speeds  in 
massively  parallel  three-dimensional  bit-level  systolic-array  architectures. 

In media that have both a self-focusing nonlinearity ($n_2 > 0$) and negative group velocity
dispersion, a pulse can collapse in both space and time, forming a stable "light bullet" in 3+1
dimensions. These collapsed optical pulses have high peak powers but small total energies,
making them attractive for nonlinear optics applications. These three-dimensional solitons can
be made to interact to produce a logic gate in the same way that one-dimensional spatial (or
temporal) solitons do. We are investigating the asymmetric interaction between two orthogonally
polarized solitons brought into coincidence both spatially and temporally at the boundary of
a nonlinear medium and propagating at slightly different angles. This geometry permits a
weak signal to spatially drag a strong pump by well over a beam width, allowing them both
to be blocked by a spatial aperture. This results in a phase-insensitive inverter with gain
that can be cascaded to implement a high-contrast NOR gate that transmits an uncorrupted
pump through the aperture if the signals are not present, and drags the pump out of the
spatial aperture so that it is blocked if they are present. A beam propagation simulation of
an asymmetric light-bullet dragging interaction is shown in Figure 1. This shows the dragging of
a 5 μm × 5 μm × 4 μm, 1 pJ pump ($I_p$ = 60 MW/cm²) by a 0.25 pJ signal ($I_s$ = 15 MW/cm²) in about
0.7 mm of propagation distance using a rather large saturating nonlinearity of $n_2 = 10^{-18}$ m²/V².
Simultaneous multidimensional dragging by two signals, one tilted in x and the other tilted in y, is
shown in Figure 2, demonstrating the single-stage implementation of a NOR gate. The signals
are in the same polarization and interfere, but numerical studies show that the asymmetric
interaction can always function as a NOR gate for any relative phase of the two signal solitons.
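
The logical behaviour of the single-stage gate can be summarized by a toy geometric model. This is only a sketch of the truth table, not of the beam propagation: it assumes each present signal displaces the pump by a fixed amount along its own tilt direction, and the displacement and aperture values are hypothetical numbers chosen so that a single signal drags the pump by more than the aperture half-width.

```python
def pump_transmitted(signals, aperture_half_width=5.0, drag_per_signal=8.0):
    """Toy model: each present signal drags the pump sideways (micrometres).

    signals is a pair of booleans for the signals tilted in x and in y;
    the pump passes only if it stays inside the circular aperture.
    The numeric defaults are hypothetical illustration values.
    """
    dx = drag_per_signal if signals[0] else 0.0
    dy = drag_per_signal if signals[1] else 0.0
    return (dx * dx + dy * dy) ** 0.5 <= aperture_half_width


# Print the truth table: inputs a, b and the pump output.
for a in (False, True):
    for b in (False, True):
        print(int(a), int(b), int(pump_transmitted((a, b))))
```

The pump is transmitted only when neither signal is present, which is the NOR function; with both signals present the pump is dragged along the diagonal and is still blocked.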

In order to optimize these light bullet dragging gates without the huge computational
overhead of the 3+1 dimensional beam propagation simulations, we are examining the properties of
1-D asymmetric spatial soliton dragging interactions, as illustrated in Figure 3. The figure
illustrates asymmetric soliton dragging, showing propagation of a pump alone on the left passing
through a spatial aperture, the dragging of a pump by a signal in the middle, and a signal
alone propagating at an angle on the right. The top plane represents the spatial evolution of
the intensity profile for the pump polarization and the bottom plane represents intensity for the
orthogonal signal polarization. The inset schematically shows the waveguide device geometry of
an inverter in both states.

As an example we illustrate the output contrast ratio (with respect to the fundamental soliton power)
in a two-level system saturating nonlinear medium (in this case $I_{sat}$ is 4 times the nonsaturated
fundamental soliton peak intensity), as a function of interaction angle, propagation distance
and  pump-to-signal  beam  ratio  in  Figure  4.  It  is  clear  that  high  gain  and  good  contrast  are 
achievable  when  linear  resolvability  1  for  propagation  distances  >  5. 

Layered sandwiches of nonlinear media, aperture arrays and linear media suggest the
possibility of computing in 3-D with asymmetric light bullet dragging gates. But getting the signals
and pumps to the desired interaction sites without disruption by unwanted signals may be
difficult. It may be necessary to systolize at the bit level in 3 dimensions by pulsing the clock
pumps and signals so that they pass through each other in intervening layers of linear media
until arriving at the desired dragging logic site, and to program the functionality of the array
of logic gates by the presence and absence of clock pumps in the space-time lattice of possible
light bullet locations. An example of a cascaded majority logic circuit based on 1-D spatial
soliton dragging is illustrated in Figure 5, demonstrating that more complex functions than simple NOR
gates can be implemented without intervening interconnections.

Ultrafast,  massively  parallel,  low  latency,  all  optical  NOR  gates  with  gain,  cascadability, 
input-output  isolation,  and  phase  insensitivity  have  been  proposed,  numerically  demonstrated 
and  parametrically  optimized.  This  optical  switching  interaction  opens  up  new  architectural 
possibilities for computing in 3+1 dimensions that may allow the realization of volume parallel
digital  optical  computers. 

The  authors  acknowledge  support  of  the  NSF  young  investigator  program  ECS  9258088  and 
a DoD fellowship DAAL03-92-G-0351.
Figure 1: Light bullet dragging resulting in a contrast of 32. In the bottom frame, the 1/4 pJ
signal soliton, rendered as a solid isosurface at 1/10 of its peak intensity, is initially overlapping
the 1 pJ pump light bullet, represented by a grid isosurface at the same intensity. As the two
solitons propagate upward, with the signal tilted at an initial four degree angle, the pump is
pulled out of the 10 micron aperture to implement the inversion or switching operation.


Figure  2:  Single-stage  two-input  NOR  light  bullet  dragging  logic  gate.  All  parameters  are  the 
same  as  the  previous  figure,  but  now  two  signals,  each  tilted  at  an  initial  four  degree  angle  in 
two  orthogonal  directions,  interact  with  the  pump  and  drag  it  on  the  diagonal  between  them. 
The two signals are at nearly the worst case of 5π/6 out of phase.


